Cancer-linked gene as target for chemotherapy

ABSTRACT

Cancer-linked gene sequences, and derived amino acid sequences, are disclosed along with processes for assaying potential antitumor agents based on their modulation of the expression of these cancer-linked genes. Also disclosed are antibodies that react with the disclosed polypeptides and methods of using the antibodies to treat cancerous conditions, such as by using the antibody to target cancerous cells in vivo for purposes of delivering therapeutic agents thereto. Also described are methods of diagnosis using the gene sequences.

[0001] This application claims priority of U.S. Provisional Application Serial No. 60/362,419, filed 7 Mar. 2002, the disclosure of which is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to methods of screening cancer-linked genes and expression products for involvement in the cancer initiation and facilitation process as a means of cancer diagnosis as well as screening potential anti-cancer agents, and includes antibodies against expression products of said genes and immuno-conjugates containing such antibodies.

BACKGROUND OF THE INVENTION

[0003] Cancer-linked genes are valuable in that they indicate genetic differences between cancer cells and normal cells, such as where a gene is expressed in a cancer cell but not in a non-cancer cell, or where said gene is over-expressed or expressed at a higher level in a cancer cell as opposed to a normal or non-cancer cell. In addition, the expression of such a gene in a normal cell but not in a cancer cell, especially of the same type of tissue, can indicate important functions in the cancerous process. For example, screening assays for novel drugs are based on the response of model cell based systems in vitro to treatment with specific compounds. Such genes are also useful in the diagnosis of cancer and the identification of a cell as cancerous. Gene activity is readily determined by measuring the rate of production of gene products, such as RNAs and polypeptides encoded by such genes. Where genes encode cell surface proteins, the appearance of, or alterations in, such proteins, as cell surface markers, is an indication of neoplastic activity. Some such screens rely on specific genes, such as oncogenes (or gene mutations). In accordance with the present invention, a cancer-linked gene has been identified and its putative amino acid sequence worked out. Such a gene is useful in the diagnosis of cancer, the screening of anticancer agents and the treatment of cancer using such agents, especially in that it encodes polypeptides that can act as markers, such as cell surface markers, thereby providing ready targets for anti-tumor agents such as antibodies, preferably antibodies complexed to cytotoxic agents, including apoptotic agents.

BRIEF SUMMARY OF THE INVENTION

[0004] In accordance with the present invention, there is provided herein a cancer specific gene, linked especially to breast or liver cancer, or otherwise involved in the cancer initiating and facilitating process and the derived amino acid sequence thereof, including a number of different transcripts derived from said gene.

[0005] In one aspect, the present invention relates to a process for identifying an agent that modulates the activity of a cancer-related gene comprising:

[0006] (a) contacting a compound with a cell containing a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of the polynucleotide sequences of SEQ ID NO: 1-27 and under conditions promoting the expression of said gene; and

[0007] (b) detecting a difference in expression of said gene relative to when said compound is not present

[0008] thereby identifying an agent that modulates the activity of a cancer-related gene.

[0009] In various embodiments of such a process, the cell is a cancer cell and the difference in expression is a decrease in expression. Such polynucleotides may also include those that have sequences identical to polynucleotide sequences of SEQ ID NO: 1-27.

[0010] In another aspect, the present invention relates to a process for identifying an anti-neoplastic agent comprising contacting a cell exhibiting neoplastic activity with a compound first identified as a cancer related gene modulator using an assay process disclosed herein and detecting a decrease in said neoplastic activity after said contacting compared to when said contacting does not occur. Such neoplastic activity may include accelerated cellular replication and/or metastasis, and the decrease in neoplastic activity preferably results from the death of the cell, or senescence, terminal differentiation or growth inhibition.

[0011] The present invention also relates to a process for identifying an anti-neoplastic agent comprising administering to an animal exhibiting a cancerous condition an effective amount of an agent first identified according to a process of one of the assays disclosed according to the invention and detecting a decrease in said cancerous condition.

[0012] The present invention further relates to a process for determining the cancerous status of a cell, comprising determining an increase in the level of expression in said cell of at least one gene that corresponds to a polynucleotide having a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27 wherein an elevated expression relative to a known non-cancerous cell indicates a cancerous state or potentially cancerous state. Such elevated expression may be due to an increased copy number.

[0013] The present invention additionally relates to an isolated polypeptide, encoded by one of the polynucleotide transcripts disclosed herein, comprising an amino acid sequence homologous to an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29 wherein any difference between said amino acid sequence and the sequence of polypeptide sequences of SEQ ID NO: 28-29 is due solely to conservative amino acid substitutions and wherein said isolated polypeptide comprises at least one immunogenic fragment. In a preferred embodiment, the present invention encompasses an isolated polypeptide comprising an amino acid sequence homologous to an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29.

[0014] The present invention also relates to an antibody that reacts with a polypeptide as disclosed herein, preferably a polypeptide comprising an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29. Such an antibody may be polyclonal, monoclonal, recombinant or synthetic in origin.

[0015] In one such embodiment, said antibody is associated, either covalently or non-covalently, with a cytotoxic agent, for example, an apoptotic agent. Thus, the present invention relates to an immuno-conjugate comprising an antibody of the invention and a cytotoxic agent. In a preferred embodiment, the cytotoxic agent is a calicheamicin, a maytansinoid, an adozelesin, DC1, a cytotoxic protein, a taxol, a taxotere, or a taxoid. In especially preferred embodiments, the calicheamicin is calicheamicin γ₁^I, N-acetyl gamma calicheamicin dimethyl hydrazide or calicheamicin θ₁^I, the maytansinoid is DM1, the cytotoxic protein is ricin, abrin, gelonin, pseudomonas exotoxin or diphtheria toxin, the taxol is paclitaxel, and the taxotere is docetaxel.

[0016] The present invention also relates to a process for treating cancer comprising contacting a cancerous cell with an agent having activity against an expression product encoded by a gene sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27. In one such embodiment, the cancerous cell is contacted in vivo. In another such embodiment, said agent has affinity for said expression product. In a preferred embodiment, such agent is an antibody disclosed herein, such as an antibody that is specific or selective for, or otherwise reacts with, a polypeptide of the invention. In a preferred embodiment, the expression product is a polypeptide incorporating an amino acid sequence selected from polypeptide sequences of SEQ ID NO: 28-29.

[0017] The present invention further encompasses an immunogenic composition comprising a polypeptide disclosed herein, as well as compositions formed using antibodies specific for these polypeptides.

[0018] The present invention is also directed to uses of such compositions. Such uses include a method for treating cancer in an animal afflicted therewith comprising administering to said animal an amount of an immunogenic composition of one or more of the polypeptides disclosed herein where such amount is an amount sufficient to elicit the production of cytotoxic T lymphocytes specific for a polypeptide of the invention, preferably a polypeptide incorporating a sequence of polypeptide sequences of SEQ ID NO: 28-29. In a preferred embodiment, the animal to be so treated is a human patient.

[0019] The present invention presents assays for identifying agents, including small organic compounds, having anti-neoplastic activity and thereby also affords a process for treating a cancerous condition in an animal afflicted therewith comprising administering to said animal a therapeutically effective amount of such an agent, preferably one first identified as having anti-neoplastic activity using an assay process of the invention and subsequently administering said agent to a test animal to confirm such activity. Such agents may likewise be used to protect an animal, such as a human patient at risk of developing cancer, from developing such a disease.

DEFINITIONS

[0020] As used herein, the terms “portion,” “segment,” and “fragment,” when used in relation to polypeptides, refer to a continuous sequence of residues, such as amino acid residues, which sequence forms a subset of a larger sequence. For example, if a polypeptide were subjected to treatment with any of the common endopeptidases, such as trypsin or chymotrypsin, the oligopeptides resulting from such treatment would represent portions, segments or fragments of the starting polypeptide. When used in relation to polynucleotides, such terms refer to the products produced by treatment of said polynucleotides with any of the common endonucleases.

[0021] As used herein, the term “isolated” means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). It could also be produced recombinantly and subsequently purified. For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides, for example, those prepared recombinantly, could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. In one embodiment of the present invention, such isolated, or purified, polypeptide is useful in generating antibodies for practicing the invention, or where said antibody is attached to a cytotoxic or cytolytic agent, such as an apoptotic agent.

[0022] The term “percent identity” or “percent identical,” when referring to a sequence, means that a sequence is compared to a claimed or described sequence after alignment of the sequence to be compared (the “Compared Sequence”) with the described or claimed sequence (the “Reference Sequence”). The Percent Identity is then determined according to the following formula:

Percent Identity=100[1−(C/R)]

[0023] wherein C is the number of differences between the Reference Sequence and the Compared Sequence over the length of alignment between the Reference Sequence and the Compared Sequence wherein (i) each base or amino acid in the Reference Sequence that does not have a corresponding aligned base or amino acid in the Compared Sequence and (ii) each gap in the Reference Sequence and (iii) each aligned base or amino acid in the Reference Sequence that is different from an aligned base or amino acid in the Compared Sequence, constitutes a difference; and R is the number of bases or amino acids in the Reference Sequence over the length of the alignment with the Compared Sequence with any gap created in the Reference Sequence also being counted as a base or amino acid.

[0024] If an alignment exists between the Compared Sequence and the Reference Sequence for which the percent identity as calculated above is about equal to or greater than a specified minimum Percent Identity then the Compared Sequence has the specified minimum percent identity to the Reference Sequence even though alignments may exist in which the hereinabove calculated Percent Identity is less than the specified Percent Identity.
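
As an illustration only, the calculation of paragraphs [0022] to [0024] can be sketched in Python. The sketch assumes the two sequences have already been aligned by some external method and are supplied as equal-length strings in which a hyphen marks a gap. The function names `percent_identity` and `meets_minimum_identity`, the gap symbol, and the strict "greater than or equal" comparison (the text says "about equal to or greater than") are assumptions made for this example and are not part of the disclosure.

```python
from typing import Iterable, Tuple


def percent_identity(reference_aligned: str, compared_aligned: str, gap: str = "-") -> float:
    """Return 100 * [1 - (C / R)] for one pairwise alignment.

    C counts every alignment column that is a gap in either sequence or a
    mismatch. R is the length of the aligned Reference Sequence, with gaps
    in the Reference Sequence counted as positions.
    """
    if len(reference_aligned) != len(compared_aligned):
        raise ValueError("aligned sequences must have the same length")
    if not reference_aligned:
        raise ValueError("alignment must not be empty")

    differences = 0  # C in the formula
    for ref_char, cmp_char in zip(reference_aligned.upper(), compared_aligned.upper()):
        if ref_char == gap or cmp_char == gap or ref_char != cmp_char:
            differences += 1

    reference_length = len(reference_aligned)  # R in the formula
    return 100.0 * (1.0 - differences / reference_length)


def meets_minimum_identity(alignments: Iterable[Tuple[str, str]], minimum: float) -> bool:
    """True if at least one candidate alignment reaches the minimum Percent Identity."""
    return any(percent_identity(ref, cmp) >= minimum for ref, cmp in alignments)
```

For example, an alignment of ten columns containing one gap in the Reference Sequence and one mismatch gives C = 2 and R = 10. Passing several candidate alignments to `meets_minimum_identity` reflects the statement in paragraph [0024] that a single qualifying alignment suffices even if other alignments score lower.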

[0025] As known in the art “similarity” between two polypeptides is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one polypeptide to the sequence of a second polypeptide.

[0026] In accordance with the present invention, the term “DNA segment” or “DNA sequence” refers to a DNA polymer, in the form of a separate fragment or as a component of a larger DNA construct, which has been derived from DNA isolated at least once in substantially pure form, i.e., free of contaminating endogenous materials and in a quantity or concentration enabling identification, manipulation, and recovery of the segment and its component nucleotide sequences by standard biochemical methods, for example, using a cloning vector. Such segments are provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Sequences of non-translated DNA may be present downstream from the open reading frame, where the same do not interfere with manipulation or expression of the coding regions.

[0027] The term “coding region” refers to that portion of a gene which either naturally or normally codes for the expression product of that gene in its natural genomic environment, i.e., the region coding in vivo for the native expression product of the gene. The coding region can be from a normal, mutated or altered gene, or can even be from a DNA sequence, or gene, wholly synthesized in the laboratory using methods well known to those of skill in the art of DNA synthesis.

[0028] In accordance with the present invention, the term “nucleotide sequence” refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the proteins provided by this invention are assembled from cDNA fragments and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial, eukaryotic or viral operon.

[0029] The term “expression product” means that polypeptide or protein that is the natural translation product of the gene and any nucleic acid sequence coding equivalents resulting from genetic code degeneracy and thus coding for the same amino acid(s).

[0030] The term “active fragment,” when referring to a coding sequence, means a portion comprising less than the complete coding region whose expression product retains essentially the same biological function or activity as the expression product of the complete coding region.

[0031] The term “primer” means a short nucleic acid sequence that is paired with one strand of DNA and provides a free 3′-OH end at which a DNA polymerase starts synthesis of a deoxyribonucleotide chain.

[0032] The term “promoter” means a region of DNA involved in binding of RNA polymerase to initiate transcription. The term “enhancer” refers to a region of DNA that, when present and active, has the effect of increasing expression of a different DNA sequence that is being expressed, thereby increasing the amount of expression product formed from said different DNA sequence.

[0033] The term “open reading frame (ORF)” means a series of triplets coding for amino acids without any termination codons and is a sequence (potentially) translatable into protein.
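
By way of a non-limiting sketch, the definition above can be checked programmatically. The example assumes a DNA sense-strand sequence read from its first base and the standard genetic code stop codons (TAA, TAG, TGA); the function name `is_open_reading_frame` is hypothetical and chosen only for this illustration.

```python
# Stop (termination) codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(dna: str) -> bool:
    """True if the sequence is a whole number of triplets with no termination codon."""
    sequence = dna.upper()
    if len(sequence) == 0 or len(sequence) % 3 != 0:
        return False  # not a complete series of triplets
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return all(codon not in STOP_CODONS for codon in codons)
```

The check is deliberately limited to the reading frame starting at the first base; scanning the other frames or the complementary strand is outside the scope of this sketch.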

[0034] As used herein, reference to a “DNA sequence” includes both single stranded and double stranded DNA. Thus, the specific sequence, unless the context indicates otherwise, refers to the single strand DNA of such sequence, the duplex of such sequence with its complement (double stranded DNA) and the complement of such sequence.

[0035] As used herein, “corresponding genes” refers to genes that encode an RNA that is at least 90% identical, preferably at least 95% identical, most preferably at least 98% identical, and especially identical, to an RNA encoded by one of the nucleotide sequences disclosed herein (i.e., polynucleotide sequences of SEQ ID NO: 1-27). Such genes will also encode the same polypeptide sequence as any of the sequences disclosed herein, preferably polynucleotide sequences of SEQ ID NO: 1-27, but may include differences in such amino acid sequences where such differences are limited to conservative amino acid substitutions, such as where the same overall three dimensional structure, and thus the same antigenic character, is maintained. Thus, amino acid sequences may be within the scope of the present invention where they react with the same antibodies that react with polypeptides comprising the sequences of polypeptide sequences of SEQ ID NO: 28-29. A “corresponding gene” includes splice variants thereof.

[0036] The genes identified by the present disclosure are considered “cancer-related” genes, as this term is used herein, and include genes expressed at higher levels (due, for example, to elevated rates of expression, elevated extent of expression or increased copy number) in cancer cells relative to expression of these genes in normal (i.e., non-cancerous) cells where said cancerous state or status of test cells or tissues has been determined by methods known in the art, such as by reverse transcriptase polymerase chain reaction (RT-PCR) as described in the Examples herein. In specific embodiments, this relates to the genes whose sequences correspond to the polynucleotide sequences of SEQ ID NO: 1-27.

[0037] As used herein, “conservative amino acid substitutions” are defined as exchanges within one of the following five groups:

[0038] I. Small aliphatic, nonpolar or slightly polar residues:

[0039] Ala, Ser, Thr, Pro, Gly;

[0040] II. Polar, negatively charged residues and their amides:

[0041] Asp, Asn, Glu, Gln;

[0042] III. Polar, positively charged residues:

[0043] His, Arg, Lys;

[0044] IV. Large, aliphatic, nonpolar residues:

[0045] Met, Leu, Ile, Val, Cys;

[0046] V. Large, aromatic residues:

[0047] Phe, Tyr, Trp
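
The five groups above lend themselves to a simple lookup. The following Python sketch is illustrative only: it assumes the two polypeptides are given as equal-length lists of three-letter residue codes that are already in register, with no insertions or deletions, and the names `is_conservative` and `differs_only_conservatively` are hypothetical.

```python
from typing import Sequence

# Groups I-V as listed in paragraphs [0038]-[0047].
CONSERVATIVE_GROUPS = {
    "I": {"Ala", "Ser", "Thr", "Pro", "Gly"},
    "II": {"Asp", "Asn", "Glu", "Gln"},
    "III": {"His", "Arg", "Lys"},
    "IV": {"Met", "Leu", "Ile", "Val", "Cys"},
    "V": {"Phe", "Tyr", "Trp"},
}

# Reverse lookup: residue code -> group label.
RESIDUE_TO_GROUP = {
    residue: label
    for label, members in CONSERVATIVE_GROUPS.items()
    for residue in members
}


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if the two residues differ but belong to the same group."""
    return residue_a != residue_b and RESIDUE_TO_GROUP[residue_a] == RESIDUE_TO_GROUP[residue_b]


def differs_only_conservatively(reference: Sequence[str], candidate: Sequence[str]) -> bool:
    """True if every position is identical or a conservative substitution."""
    if len(reference) != len(candidate):
        return False
    return all(a == b or is_conservative(a, b) for a, b in zip(reference, candidate))
```

Under these assumptions, an exchange of Leu for Ile (both in group IV) is conservative, whereas an exchange of Asp (group II) for Lys (group III) is not.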

DETAILED SUMMARY OF THE INVENTION

[0048] The present invention relates to processes for utilizing a nucleotide sequence for a cancer-linked gene, polypeptides encoded by such sequences and antibodies reactive with such polypeptides in methods of treating and diagnosing cancer, preferably breast or liver cancer, and in carrying out screening assays for agents effective in reducing the activity of cancer-linked genes and thereby treating a cancerous condition.

[0049] The sequences disclosed herein include various polynucleotide transcripts (polynucleotide sequences of SEQ ID NO: 1-27), and the amino acid sequences derived from said transcripts (polypeptide sequences of SEQ ID NO: 28-29) are available as targets for chemotherapeutic agents, especially anti-cancer agents, including antibodies specific for said polypeptides.

[0050] The cancer-related polynucleotide sequences disclosed herein correspond to gene sequences whose expression is indicative of the cancerous status of a given cell. Such sequences are substantially identical to polynucleotide sequences of SEQ ID NO: 1-27, which represent different transcripts identified from the GenBank EST database and which exhibit cancer-specific expression. The polynucleotides of the invention are those that correspond to a sequence of polynucleotide sequences of SEQ ID NO: 1-27. Such sequences have been searched within the GenBank database, especially the EST database, wherein the nucleotide sequences and derived polypeptides have the characteristics and sequences shown in SEQ ID NO: 1-29.

[0051] The polynucleotides and polypeptides, as gene products, used in the processes of the present invention may comprise a recombinant polynucleotide or polypeptide, a natural polynucleotide or polypeptide, or a synthetic polynucleotide or polypeptide.

[0052] Fragments of such polynucleotides and polypeptides as are disclosed herein may also be useful in practicing the processes of the present invention. For example, a fragment, derivative or analog of the polypeptide (polypeptide sequences of SEQ ID NO: 28-29) may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence which is employed for purification of the mature polypeptide (such as a histidine hexapeptide) or a proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of those skilled in the art from the teachings herein.

[0053] In another aspect, the present invention relates to an isolated polypeptide, including a purified polypeptide, comprising an amino acid sequence at least 90% identical to the amino acid sequence of polypeptide sequences of SEQ ID NO: 28-29. In preferred embodiments, said isolated polypeptide comprises an amino acid sequence having sequence identity of at least 95%, preferably at least about 98%, and especially is identical to, the sequence of polypeptide sequences of SEQ ID NO: 28-29. The present invention also includes isolated active fragments of such polypeptides where said fragments retain the biological activity of the polypeptide or where such active fragments are useful as specific targets for cancer treatment, prevention or diagnosis. Thus, the present invention relates to any polypeptides, or fragments thereof, with sufficient sequence homology to the sequences disclosed herein as to be useful in the production of antibodies that react with (i.e., are selective or specific for) the polypeptides of polynucleotide sequences of SEQ ID NO: 1-27 so as to be useful in targeting cells that exhibit such polypeptides, or fragments, on their surfaces, thereby providing targets for such antibodies and therapeutic agents associated with such antibodies.

[0054] The polynucleotides and polypeptides useful in practicing the processes of the present invention may likewise be obtained in an isolated or purified form. In addition, the polypeptides disclosed herein as being useful in practicing the processes of the invention are believed to be surface proteins present on cells, such as cancerous cells. Precisely how such cancer-linked proteins are used in the processes of the invention may thus differ depending on the therapeutic approach used. For example, cell-surface proteins, such as receptors, are desirable targets for cytotoxic antibodies that can be generated against the polypeptides disclosed herein.

[0055] The sequence information disclosed herein, as derived from the GenBank submissions, can readily be utilized by those skilled in the art to prepare the corresponding full-length polypeptide by peptide synthesis. The same is true for either the polynucleotides or polypeptides disclosed herein for use in the methods of the invention.

[0056] The present invention relates to an isolated polypeptide, encoded by one of the polynucleotide transcripts disclosed herein, comprising an amino acid sequence homologous to an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29, wherein any difference between amino acid sequence in the isolated polypeptide and the sequence of polypeptide sequences of SEQ ID NO: 28-29 is due solely to conservative amino acid substitutions and wherein said isolated polypeptide comprises at least one immunogenic fragment. In a preferred embodiment, the present invention encompasses an isolated polypeptide comprising an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29.

[0057] Methods of producing recombinant cells and vectors useful in preparing the polynucleotides and polypeptides disclosed herein are well known to those skilled in the molecular biology art. See, for example, Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), Wu et al., Methods in Gene Biotechnology (CRC Press, New York, N.Y., 1997), and Recombinant Gene Expression Protocols, in Methods in Molecular Biology, Vol. 62, (Tuan, ed., Humana Press, Totowa, N.J., 1997), the disclosures of which are hereby incorporated by reference.

[0058] In one aspect, the present invention relates to a process for identifying an agent that modulates the activity of a cancer-related gene comprising:

[0059] (a) contacting a compound with a cell containing a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of the polynucleotide sequences of SEQ ID NO: 1-27 and under conditions promoting the expression of said gene; and

[0060] (b) detecting a difference in expression of said gene relative to when said compound is not present

[0061] thereby identifying an agent that modulates the activity of a cancer-related gene.

[0062] In specific embodiments of such process the cell is a cancer cell and the difference in expression is a decrease in expression. Such polynucleotides may also include those that have sequences identical to the polynucleotide sequences of SEQ ID NO: 1-27.

[0063] In another aspect, the present invention relates to a process for identifying an anti-neoplastic agent comprising contacting a cell exhibiting neoplastic activity with a compound first identified as a cancer related gene modulator using an assay process disclosed herein and detecting a decrease in said neoplastic activity after said contacting compared to when said contacting does not occur. Such neoplastic activity may include accelerated cellular replication and/or metastasis, and the decrease in neoplastic activity preferably results from the death of the cell.

[0064] The present invention also relates to a process for identifying an anti-neoplastic agent comprising administering to an animal exhibiting a cancerous condition an effective amount of an agent first identified according to a process of one of the assays disclosed according to the invention and detecting a decrease in said cancerous condition.

[0065] In specific embodiments of the present invention, the genes useful for the invention comprise genes that correspond to polynucleotides having a sequence selected from polynucleotide sequences of SEQ ID NO: 1-27, or may comprise the sequence of any of the polynucleotides disclosed herein (where the latter are cDNA sequences).

[0066] In accordance with the present invention, such assays rely on methods of determining the activity of the gene in question. Such assays are advantageously based on model cellular systems using cancer cell lines, primary cancer cells, or cancerous tissue samples that are maintained in growth medium and treated with compounds at a single concentration or at a range of concentrations. At specific times after treatment, cellular RNAs are conveniently isolated from the treated cells or tissues, which RNAs are indicative of expression of selected genes. The cellular RNA is then divided and subjected to differential analysis that detects the presence and/or quantity of specific RNA transcripts, which transcripts may then be amplified for detection purposes using standard methodologies, such as, for example, reverse transcriptase polymerase chain reaction (RT-PCR), etc. The presence or absence, or concentration levels, of specific RNA transcripts are determined from these measurements. The polynucleotide sequences disclosed herein are readily used as probes for the detection of such RNA transcripts and thus the measurement of gene activity and expression.

[0067] The polynucleotides of the invention can include fully operational genes with attendant control or regulatory sequences or merely a polynucleotide sequence encoding the corresponding polypeptide or an active fragment or analog thereof.

[0068] Because expression of the polynucleotide sequences disclosed herein is specific to the cancerous state, useful gene modulation is downward modulation, so that, as a result of exposure to an antineoplastic agent identified by the screening assays herein, the corresponding gene of the cancerous cell is expressed at a lower level (or not expressed at all) when exposed to the agent as compared to the expression when not exposed to the agent. For example, the gene sequences disclosed herein (polynucleotide sequences of SEQ ID NO: 1-27) correspond to a gene expressed at a higher level in cells of breast or liver cancer than in normal breast or liver cells. Thus, where said chemical agent causes this gene of the tested cell to be expressed at a lower level than the same gene of the reference, this is indicative of downward modulation and indicates that the chemical agent to be tested has anti-neoplastic activity.

[0069] In carrying out the assays disclosed herein, relative antineoplastic activity may be ascertained by the extent to which a given chemical agent modulates the expression of genes present in a cancerous cell. Thus, a first chemical agent that modulates the expression of a gene associated with the cancerous state (i.e., a gene corresponding to one or more of the polynucleotide transcripts disclosed herein) to a larger degree than a second chemical agent tested by the assays of the invention is thereby deemed to have higher, or more desirable, or more advantageous, anti-neoplastic activity than said second chemical agent.
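
The comparison of agents described above can be sketched, purely for illustration, as a ranking by degree of downward modulation. The sketch assumes each agent has a single measured expression level without the agent and with the agent, in the same arbitrary units; the scoring by fractional reduction, the function names, and the agent labels are assumptions of this example and are not prescribed by the disclosure.

```python
from typing import Dict, List, Tuple


def fractional_reduction(untreated: float, treated: float) -> float:
    """Fraction by which expression falls on treatment (negative if it rises)."""
    if untreated <= 0:
        raise ValueError("untreated expression level must be positive")
    return (untreated - treated) / untreated


def rank_agents(measurements: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Order agents from the largest to the smallest downward modulation.

    `measurements` maps an agent label to (untreated level, treated level).
    """
    scored = [
        (agent, fractional_reduction(untreated, treated))
        for agent, (untreated, treated) in measurements.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical readings, for demonstration only.
    example = {"agent_A": (100.0, 80.0), "agent_B": (100.0, 25.0)}
    for agent, score in rank_agents(example):
        print(agent, round(score, 2))
```

An agent that appears earlier in the returned list reduces expression of the gene to a larger degree and, in the sense of paragraph [0069], would be deemed to have the more advantageous activity.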

[0070] The gene expression to be measured is commonly assayed using RNA expression as an indicator. Thus, the greater the level of RNA (for example, messenger RNA or mRNA) detected the higher the level of expression of the corresponding gene. Thus, gene expression, either absolute or relative, is determined by the relative expression of the RNAs encoded by such genes.

[0071] RNA may be isolated from samples in a variety of ways, including lysis and denaturation with a phenolic solution containing a chaotropic agent (e.g., trizol) followed by isopropanol precipitation, ethanol wash, and resuspension in aqueous solution; or lysis and denaturation followed by isolation on solid support, such as a Qiagen resin and reconstitution in aqueous solution; or lysis and denaturation in non-phenolic, aqueous solutions followed by enzymatic conversion of RNA to DNA template copies.

[0072] Normally, prior to applying the processes of the invention, steady state RNA expression levels for the genes, and sets of genes, disclosed herein will have been obtained. It is the steady state level of such expression that is affected by potential anti-neoplastic agents as determined herein. Such steady state levels of expression are easily determined by any methods that are sensitive, specific and accurate. Such methods include, but are in no way limited to, real time quantitative polymerase chain reaction (PCR), for example, using a Perkin-Elmer 7700 sequence detection system with gene specific primer probe combinations as designed using any of several commercially available software packages, such as Primer Express software; solid support based hybridization array technology using appropriate internal controls for quantitation, including filter, bead, or microchip based arrays; and solid support based hybridization arrays using, for example, chemiluminescent, fluorescent, or electrochemical reaction based detection systems.

[0073] The gene expression indicative of a cancerous state need not be characteristic of every cell of a given tissue. Thus, the methods disclosed herein are useful for detecting the presence of a cancerous condition within a tissue where less than all cells exhibit the complete pattern. Thus, for example, a selected gene corresponding to one of the polynucleotide sequences of SEQ ID NO: 1-27 may be found, using appropriate probes, either DNA or RNA, to be present in as little as 60% of cells derived from a sample of tumorous, or malignant, tissue. In a highly preferred embodiment, such gene pattern is found to be present in 100% of cells drawn from a cancerous tissue and absent from 100% of a corresponding normal, non-cancerous, tissue sample.

[0074] Expression of a gene may be related to copy number, and changes in expression may be measured by determining copy number. Such change in gene copy number may be determined by determining a change in expression of messenger RNA encoded by a particular gene sequence, especially that of polynucleotide sequences of SEQ ID NO: 1-27. Also in accordance with the present invention, said gene may be a cancer initiating or facilitating gene. In carrying out the methods of the present invention, a cancer facilitating gene is a gene that, while not directly initiating tumor formation or growth, acts, such as through the actions of its expression product, to direct, enhance, or otherwise facilitate the progress of the cancerous condition, including where such gene acts against genes, or gene expression products, that would otherwise have the effect of decreasing tumor formation and/or growth.

[0075] Although the expression of a gene corresponding to a sequence of polynucleotide sequences of SEQ ID NO: 1-27 may be indicative of a cancerous status for a given cell, the mere presence of such a gene may not alone be sufficient to achieve a malignant condition and thus the level of expression of such gene may also be a significant factor in determining the attainment of a cancerous state. Thus, it becomes essential to also determine the level of expression of a gene as disclosed herein, including substantially similar sequences, as a separate means of diagnosing the presence of a cancerous status for a given cell, groups of cells, or tissues, either in culture or in situ.

[0076] The level of expression of the polypeptides disclosed herein is also a measure of gene expression, such as polypeptides having sequence identical, or similar to, any polypeptide encoded by a sequence of polynucleotide sequences of SEQ ID NO: 1-27, especially a polypeptide whose amino acid sequence is the sequence of polypeptide sequences of SEQ ID NO: 28-29.

[0077] In accordance with the foregoing, the present invention specifically contemplates a method for determining the cancerous status of a cell to be tested, comprising determining the level of expression in said cell of a gene that includes one of the nucleotide sequences selected from the sequences of polynucleotide sequences of SEQ ID NO: 1-27, including sequences substantially identical to said sequences, or characteristic fragments thereof, or the complements of any of the foregoing and then comparing said expression to that of a cell known to be non-cancerous whereby the difference in said expression indicates that said cell to be tested is cancerous.
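
By way of non-limiting illustration only, the comparison step of the foregoing method may be sketched as follows. The two-fold cutoff, the averaging of several non-cancerous reference measurements, and all names and values are assumptions made for illustration; the method as disclosed requires only that a difference in expression be determined.

```python
from statistics import mean


def differs_from_normal(
    test_level: float,
    normal_levels: list[float],
    min_fold: float = 2.0,  # hypothetical cutoff, not taken from the disclosure
) -> bool:
    """Return True when the test cell's expression differs from the mean of
    the known non-cancerous cells by at least `min_fold` in either direction."""
    if test_level <= 0 or min_fold <= 1 or not normal_levels:
        raise ValueError("invalid input")
    baseline = mean(normal_levels)
    if baseline <= 0:
        raise ValueError("reference expression must be positive")
    fold = test_level / baseline
    return fold >= min_fold or fold <= 1.0 / min_fold


if __name__ == "__main__":
    normal = [100.0, 110.0, 90.0]  # hypothetical reference measurements
    print(differs_from_normal(400.0, normal))  # elevated expression
    print(differs_from_normal(105.0, normal))  # within the normal range
```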

[0078] In accordance with the invention, although gene expression for a gene that includes as a portion thereof one of the sequences of polynucleotide sequences of SEQ ID NO: 1-27, is preferably determined by use of a probe that is a fragment of such nucleotide sequence, it is to be understood that the probe may be formed from a different portion of the gene. Expression of the gene may be determined by use of a nucleotide probe that hybridizes to messenger RNA (mRNA) transcribed from a portion of the gene other than the specific nucleotide sequence disclosed herein.

[0079] It should be noted that there are a variety of different contexts in which genes have been evaluated as being involved in the cancerous process. Thus, some genes may be oncogenes and encode proteins that are directly involved in the cancerous process and thereby promote the occurrence of cancer in an animal. In addition, other genes may serve to suppress the cancerous state in a given cell or cell type and thereby work against a cancerous condition forming in an animal. Other genes may simply be involved either directly or indirectly in the cancerous process or condition and may serve in an ancillary capacity with respect to the cancerous state. All such types of genes are deemed to be within those to be determined in accordance with the invention as disclosed herein. Thus, the gene determined by said process of the invention may be an oncogene, or the gene determined by said process may be a cancer facilitating gene, the latter including a gene that directly or indirectly affects the cancerous process, either in the promotion of a cancerous condition or in facilitating the progress of cancerous growth or otherwise modulating the growth of cancer cells, either in vivo or ex vivo. In addition, the gene determined by said process may be a cancer suppressor gene, which gene works either directly or indirectly to suppress the initiation or progress of a cancerous condition. Such genes may work indirectly where their expression alters the activity of some other gene or gene expression product that is itself directly involved in initiating or facilitating the progress of a cancerous condition. For example, a gene that encodes a polypeptide, either wild or mutant in type, which polypeptide acts to suppress a tumor suppressor gene, or its expression product, will thereby act indirectly to promote tumor growth.

[0080] As noted previously, polynucleotides encoding the same proteins as any of polynucleotide sequences of SEQ ID NO: 1-27, regardless of the percent identity of such sequences, are also specifically contemplated by any of the methods of the present invention that rely on any or all of said sequences, regardless of how they are otherwise described or limited. Thus, any such sequences are available for use in carrying out any of the methods disclosed according to the invention. Such sequences also include any open reading frames, as defined herein, present within the sequence of polynucleotide sequences of SEQ ID NO: 1-27.

[0081] Because a gene disclosed according to the invention “corresponds to” a polynucleotide having a sequence of polynucleotide sequences of SEQ ID NO: 1-27, said gene encodes an RNA (processed or unprocessed, including naturally occurring splice variants and alleles) that is at least 90% identical, preferably at least 95% identical, most preferably at least 98% identical to, and especially identical to, an RNA that would be encoded by, or be complementary to, such as by hybridization with, a polynucleotide having the indicated sequence. In addition, genes including sequences at least 90% identical to a sequence selected from polynucleotide sequences of SEQ ID NO: 1-27, preferably at least about 95% identical to such a sequence, more preferably at least about 98% identical to such sequence and most preferably comprising such sequence are specifically contemplated by all of the processes of the present invention. Sequences encoding the same proteins as any of these sequences, regardless of the percent identity of such sequences, are also specifically contemplated by any of the methods of the present invention that rely on any or all of said sequences, regardless of how they are otherwise described or limited. The polynucleotide sequences of the invention also include any open reading frames, as defined herein, present within any of the sequences of polynucleotide sequences of SEQ ID NO: 1-27.
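
By way of non-limiting illustration only, the identity thresholds recited above (at least 90%, at least 95%, at least 98%, and identical) may be applied as sketched below. The sketch assumes two sequences that have already been aligned to equal length without gaps and computes identity as the fraction of matching positions; this simplification and the example sequences are assumptions for illustration and do not replace the definition of percent identity given elsewhere herein.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned, ungapped
    sequences of equal length (case-insensitive)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers recited in the paragraph above."""
    if pct >= 100.0:
        return "identical"
    if pct >= 98.0:
        return "at least 98%"
    if pct >= 95.0:
        return "at least 95%"
    if pct >= 90.0:
        return "at least 90%"
    return "below 90%"


if __name__ == "__main__":
    # Hypothetical 10-nucleotide sequences differing at one position.
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, identity_tier(pct))
```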

[0082] The sequences disclosed herein may be genomic in nature and thus represent the sequence of an actual gene, such as a human gene, or may be a cDNA sequence derived from a messenger RNA (mRNA) and thus represent contiguous exonic sequences derived from a corresponding genomic sequence, or they may be wholly synthetic in origin for purposes of practicing the processes of the invention. Because of the processing that may take place in transforming the initial RNA transcript into the final mRNA, the sequences disclosed herein may represent less than the full genomic sequence. They may also represent sequences derived from ribosomal and transfer RNAs. Consequently, the gene as present in the cell (and representing the genomic sequence) and the polynucleotide transcripts disclosed herein, including cDNA sequences, may be identical or may be such that the cDNAs contain less than the full genomic sequence. Such genes and cDNA sequences are still considered “corresponding sequences” (as defined elsewhere herein) because they both encode the same or related RNA sequences (i.e., related in the sense of being splice variants or RNAs at different stages of processing). Thus, by way of non-limiting example only, a gene that encodes an RNA transcript, which is then processed into a shorter mRNA, is deemed to encode both such RNAs and therefore encodes an RNA complementary to (using the usual Watson-Crick complementarity rules), or that would otherwise be encoded by, a cDNA (for example, a sequence as disclosed herein). Thus, the sequences disclosed herein correspond to genes contained in the cancerous cells (here, breast or liver cancer) and are used to determine gene activity or expression because they represent the same sequence or are complementary to RNAs encoded by the gene. 
Such a gene also includes different alleles and splice variants that may occur in the cells used in the methods of the invention, such as where recombinant cells are used to assay for anti-neoplastic agents and such cells have been engineered to express a polynucleotide as disclosed herein, including cells that have been engineered to express such polynucleotides at a higher level than is found in non-engineered cancerous cells or where such recombinant cells express such polynucleotides only after having been engineered to do so. Such engineering includes genetic engineering, such as where one or more of the polynucleotides disclosed herein has been inserted into the genome of such cell or is present in a vector.

[0083] Such cells, especially mammalian cells, may also be engineered to express on their surfaces one or more of the polypeptides of the invention for testing with antibodies or other agents capable of masking such polypeptides and thereby removing the cancerous nature of the cell. Such engineering includes both genetic engineering, where the genetic complement of the cells is engineered to express the polypeptide, as well as non-genetic engineering, whereby the cell has been physically manipulated to incorporate a polypeptide of the invention in its plasma membrane, such as by direct insertion using chemical and/or other agents to achieve this result.

[0084] In accordance with the foregoing, the present invention includes anti-cancer agents that are themselves either polypeptides, or small chemical entities, that affect the cancerous process, including initiation, suppression or facilitation of tumor growth, either in vivo or ex vivo. Said cancer modulating agent will have the effect of decreasing gene expression.

[0085] The present invention thus also relates to a method for treating cancer comprising contacting a cancerous cell with an agent having activity against an expression product encoded by a gene sequence as disclosed herein, such as the sequence of polynucleotide sequences of SEQ ID NO: 1-27. The present invention also relates to a process for treating cancer comprising contacting a cancerous cell with an agent having activity against an expression product encoded by a gene sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27. In one such embodiment, the cancerous cell is contacted in vivo. In another such embodiment, said agent has affinity for said expression product. In a preferred embodiment, such agent is an antibody disclosed herein, such as an antibody that is specific or selective for, or otherwise reacts with, a polypeptide of the invention. In a preferred embodiment, the expression product is a polypeptide incorporating an amino acid sequence selected from polypeptide sequences of SEQ ID NO: 28-29.

[0086] The present invention is also directed to such uses of the compositions of polypeptides and antibodies disclosed herein. Such uses include a process for treating cancer in an animal afflicted therewith comprising administering to said animal an amount of an immunogenic composition of one or more of the polypeptides disclosed herein where such amount is an amount sufficient to elicit the production of cytotoxic T lymphocytes specific for a polypeptide of the invention, preferably a polypeptide incorporating a sequence of polypeptide sequences of SEQ ID NO: 28-29. In a preferred embodiment, the animal to be so treated is a human patient.

[0087] The proteins encoded by the genes disclosed herein, due to their expression, or elevated expression, in cancer cells, represent highly useful therapeutic targets for “targeted therapies” utilizing such affinity structures as, for example, antibodies coupled to some cytotoxic agent. In such methodology, it is advantageous that nothing need be known about the endogenous ligands or binding partners for such cell surface molecules. Rather, an antibody or equivalent molecule that can specifically recognize the cell surface molecule (which could include an artificial peptide, a surrogate ligand, and the like) that is coupled to some agent that can induce cell death or a block in cell cycling offers therapeutic promise against these proteins. Thus, such approaches include the use of so-called suicide “bullets” against intracellular proteins. For example, monoclonal antibodies may readily be produced by methods well known in the art, for example, the method of Kohler and Milstein (see: Nature, 256:495 (1975)).

[0088] With the advent of methods of molecular biology and recombinant technology, it is now possible to produce antibody molecules by recombinant means and thereby generate gene sequences that code for specific amino acid sequences found in the polypeptide structure of the antibodies. Such antibodies can be produced by either cloning the gene sequences encoding the polypeptide chains of said antibodies or by direct synthesis of said polypeptide chains, with in vitro assembly of the synthesized chains to form active tetrameric (H₂L₂) structures with affinity for specific epitopes and antigenic determinants. This has permitted the ready production of antibodies having sequences characteristic of neutralizing antibodies from different species and sources.

[0089] Regardless of the source of the antibodies, or how they are recombinantly constructed, or how they are synthesized, in vitro or in vivo, using transgenic animals, such as cows, goats and sheep, using large cell cultures of laboratory or commercial size, in bioreactors or by direct chemical synthesis employing no living organisms at any stage of the process, all antibodies have a similar overall three-dimensional structure. This structure is often given as H₂L₂ and refers to the fact that antibodies commonly comprise 2 light (L) amino acid chains and 2 heavy (H) amino acid chains. Both chains have regions capable of interacting with a structurally complementary antigenic target. The regions interacting with the target are referred to as “variable” or “V” regions and are characterized by differences in amino acid sequence from antibodies of different antigenic specificity.

[0090] The variable regions of either H or L chains contain the amino acid sequences capable of specifically binding to antigenic targets. Within these sequences are smaller sequences dubbed “hypervariable” because of their extreme variability between antibodies of differing specificity. Such hypervariable regions are also referred to as “complementarity determining regions” or “CDR” regions. These CDR regions account for the basic specificity of the antibody for a particular antigenic determinant structure.

[0091] The CDRs represent non-contiguous stretches of amino acids within the variable regions but, regardless of species, the positional locations of these critical amino acid sequences within the variable heavy and light chain regions have been found to have similar locations within the amino acid sequences of the variable chains. The variable heavy and light chains of all antibodies each have 3 CDR regions, each non-contiguous with the others (termed L1, L2, L3, H1, H2, H3) for the respective light (L) and heavy (H) chains. The accepted CDR regions have been described by Kabat et al., J. Biol. Chem. 252:6609-6616 (1977).

[0092] In all mammalian species, antibody polypeptides contain constant (i.e., highly conserved) and variable regions, and, within the latter, there are the CDRs and the so-called “framework regions” made up of amino acid sequences within the variable region of the heavy or light chain but outside the CDRs.

[0093] The antibodies disclosed according to the invention may also be wholly synthetic, wherein the polypeptide chains of the antibodies are synthesized and, possibly, optimized for binding to the polypeptides disclosed herein as being receptors. Such antibodies may be chimeric or humanized antibodies and may be fully tetrameric in structure, or may be dimeric and comprise only a single heavy and a single light chain. Such antibodies may also include fragments, such as Fab and F(ab′)₂ fragments, capable of reacting with and binding to any of the polypeptides disclosed herein as being receptors.

[0094] In one aspect, the present invention relates to immunoglobulins, or antibodies, as described herein, that react with, especially where they are specific for, the polypeptides having amino acid sequences as disclosed herein, preferably those having an amino acid sequence of one of polypeptide sequences of SEQ ID NO: 28-29. Such antibodies may commonly be in the form of a composition, especially a pharmaceutical composition. Such antibodies, by themselves, may have therapeutic value in that they are able to bind to, and thereby tie up, surface sites on cancerous cells. This is particularly so where such sites have some type of function to perform (i.e., where they are surface enzymes, or channel structures, or structures that otherwise facilitate, actively or passively, the transport of nutrients and other vital materials to the cell). Such nutrients serve to facilitate the growth and replication of the cell, and molecules that bind to such sites and thereby interfere with such activities can prove to have a therapeutic effect in that the result of such binding is to remove sources of nutrients from such cells, thereby interfering with growth and replication. In like manner, such binding may serve to remove vital enzyme activities from the cell's functional repertoire, thereby also interfering with viability and/or the ability of the cell to multiply or metastasize. In addition, by binding to such surface sites, the antibodies may serve to prevent the cells from reacting to environmental agents, such as cytokines and the like, that may facilitate growth, replication and metastasis, thereby further reducing the cancerous status of such cell and ameliorating the cancerous condition in a patient, even without proving fatal to the cell or cells so affected.

[0095] The methods of the present invention also include processes wherein the cancer cell is contacted in vivo as well as ex vivo with an agent that comprises a portion, or is part of an overall molecular structure, having affinity for an expression product of a gene corresponding to a polynucleotide sequence as disclosed herein, preferably where the expression product is a cell surface structure, most preferably a polypeptide as disclosed herein, such as one that comprises an amino acid sequence of polypeptide sequences of SEQ ID NO: 28-29. In one such embodiment, said portion having affinity for said expression product is an antibody, especially where said expression product is a polypeptide or oligopeptide or comprises an oligopeptide portion, or comprises a polypeptide.

[0096] In another aspect, the present invention also relates to an antibody that reacts with a polypeptide as disclosed herein, preferably a polypeptide comprising an amino acid sequence encoded by a polynucleotide sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27. Such an antibody may be polyclonal, monoclonal, recombinant or synthetic in origin. In one such embodiment, said antibody is associated, either covalently or non-covalently, with a cytotoxic agent, for example, an apoptotic agent. It is thus contemplated that the antibody acts as a targeting vector for guiding an associated therapeutic agent to a cancerous cell, such as a cell expressing a polypeptide homologous to, if not identical to, a polypeptide as disclosed herein.

[0097] Where the cytotoxic agent is itself a polypeptide, said polypeptide may be linked directly to an antibody specific for a surface target on a cancer cell, such as where the polypeptide represents an extension of the amino acid chain of the antibody. In alternative embodiments, such molecules may be covalently linked through a long or short linker sequence, such as an amino acid sequence of 5 to 10 residues in length. Where the cytotoxic agent is some small organic molecule, such as a small organic compound, or some type of apoptotic agent, this may be covalently bonded to the antibody molecule or may be attached by some type of non-covalent linkage, including hydrophobic and electrostatic linkages. Methods for forming such linkages, especially covalent linkages, are well known to those skilled in the art.

[0098] The antibodies disclosed herein may also serve as targeting vectors for much larger structures, such as liposomes. In one such embodiment, an antibody is part of, or otherwise linked to, or associated with, a membranous structure, preferably a liposome or possibly some type of cellular organelle, which acts as a reservoir for a cytotoxic agent, such as ricin. The antibody then acts to target said liposome to a cancerous tissue in an animal, whereupon the liposome provides a source of cytotoxic agents for localized treatment of a solid tumor or other type of neoplasm.

[0099] The present invention further encompasses an immunogenic composition comprising a polypeptide disclosed herein, as well as compositions formed using antibodies specific for these polypeptides.

[0100] Methods well known in the art for making formulations are found in, for example, Remington: The Science and Practice of Pharmacy, (19th ed.) Ed. A. R. Gennaro, 1995, Mack Publishing Company, Easton, Pa. Formulations for parenteral administration may, for example, contain excipients, sterile water, or saline, polyalkylene glycols such as polyethylene glycol, oils of vegetable origin, or hydrogenated naphthalenes. Biocompatible, biodegradable lactide polymer, lactide/glycolide copolymer, or polyoxyethylene-polyoxypropylene copolymers may be used to control the release of the compounds. Other potentially useful parenteral delivery systems for agonists of the invention include ethylene-vinyl acetate copolymer particles, osmotic pumps, implantable infusion systems, and liposomes. Formulations for inhalation may contain excipients, for example, lactose, or may be aqueous solutions containing, for example, polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or may be oily solutions for administration in the form of nasal drops, or as a gel. It should be noted that, where the therapeutic agent to be administered is an immunoconjugate, these sometimes contain chemical linkages that are somewhat labile in aqueous media and therefore must be stored prior to administration in a more stable environment, such as in the form of a lyophilized powder.

[0101] Such an agent can be a single molecular structure, comprising both affinity portion and anti-cancer activity portions, wherein said portions are derived from separate molecules, or molecular structures, possessing such activity when separated and wherein such agent has been formed by combining said portions into one larger molecular structure, such as where said portions are combined into the form of an adduct. Said anti-cancer and affinity portions may be joined covalently, such as in the form of a single polypeptide, or polypeptide-like, structure or may be joined non-covalently, such as by hydrophobic or electrostatic interactions, such structures having been formed by means well known in the chemical arts. Alternatively, the anti-cancer and affinity portions may be formed from separate domains of a single molecule that exhibits, as part of the same chemical structure, more than one activity wherein one of the activities is against cancer cells, or tumor formation or growth, and the other activity is affinity for an expression product produced by expression of genes related to the cancerous process or condition.

[0102] In one embodiment of the present invention, a chemical agent, such as a protein or other polypeptide, is joined to an agent, such as an antibody, having affinity for an expression product of a cancerous cell, such as a polypeptide or protein encoded by a gene related to the cancerous process, preferably a gene as disclosed herein according to the present invention, most preferably a polypeptide sequence disclosed herein. Thus, where the presence of said expression product is essential to tumor initiation and/or growth, binding of said agent to said expression product will have the effect of negating said tumor promoting activity. In one such embodiment, said agent is an apoptosis-inducing agent that induces cell suicide, thereby killing the cancer cell and halting tumor growth.

[0103] Other genes within the cancer cell that are regulated in a manner similar to that of the genes disclosed herein and thus change their expression in a coordinated way in response to chemical compounds represent genes that are located within a common metabolic, signaling, physiological, or functional pathway so that by analyzing and identifying such commonly regulated groups of genes (groups that include the gene, or similar sequences, disclosed according to the invention), one can (a) assign known genes and novel genes to specific pathways and (b) identify specific functions and functional roles for novel genes that are grouped into pathways with genes for which their functions are already characterized or described. For example, one might identify a group of 10 genes, at least one of which is the gene as disclosed herein, that change expression in a coordinated fashion; if the function of one, such as the polypeptide encoded by the sequence disclosed herein, is known, then the other genes are thereby implicated in a similar function or pathway and may thus play a role in the cancer-initiating or cancer-facilitating process. In the same way, if a gene were found in normal cells but not in cancer cells, or happens to be expressed at a higher level in normal as opposed to cancer cells, then a similar conclusion may be drawn as to its involvement in cancer, or other diseases. Therefore, the processes disclosed according to the present invention at once provide a novel means of assigning function to genes, i.e. a novel method of functional genomics, and a means for identifying chemical compounds that have potential therapeutic effects on specific cellular pathways. Such chemical compounds may have therapeutic relevance to a variety of diseases outside of cancer as well, in cases where such diseases are known or are demonstrated to involve the specific cellular pathway that is affected.
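
By way of non-limiting illustration only, the identification of genes whose expression changes in a coordinated way with that of a gene disclosed herein may be sketched as follows. The use of the Pearson correlation coefficient across a panel of compound treatments, the 0.9 cutoff, and the gene names and expression values are all assumptions made for illustration; the disclosure does not prescribe a particular measure of coordinated regulation.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def coregulated_with(
    anchor: str,
    profiles: dict[str, list[float]],
    min_r: float = 0.9,  # hypothetical cutoff
) -> list[str]:
    """Genes whose expression across treatments tracks the anchor gene."""
    reference = profiles[anchor]
    return sorted(
        gene
        for gene, profile in profiles.items()
        if gene != anchor and pearson(reference, profile) >= min_r
    )


if __name__ == "__main__":
    # Hypothetical expression levels after treatment with four compounds.
    profiles = {
        "disclosed_gene": [1.0, 2.0, 3.0, 4.0],
        "gene_X": [2.1, 3.9, 6.2, 8.0],
        "gene_Y": [4.0, 1.0, 3.5, 2.0],
    }
    print(coregulated_with("disclosed_gene", profiles))
```

Genes returned by such a grouping would be candidates for assignment to the same pathway as the anchor gene, subject to confirmation by other means.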

[0104] The polypeptides disclosed herein, preferably those of polypeptide sequences of SEQ ID NO: 28-29, also find use as vaccines in that, where the polypeptide represents a surface protein present on a cancer cell, such polypeptide may be administered to an animal, especially a human being, for purposes of activating cytotoxic T lymphocytes (CTLs) that will be specific for, and act to lyse, cancer cells in said animal. Where used as vaccines, such polypeptides are present in the form of a pharmaceutical composition. The present invention may also employ polypeptides that have the same, or similar, immunogenic character as the polypeptides of polypeptide sequences of SEQ ID NO: 28-29 and thereby elicit the same, or similar, immunogenic response after administration to an animal, such as an animal at risk of developing cancer, or afflicted therewith. Thus, the polypeptides disclosed according to the invention will commonly find use as immunogenic compositions.

[0105] Expression of a gene corresponding to a polynucleotide disclosed herein, when in normal tissues, may indicate a predisposition towards development of breast or liver cancer. The encoded polypeptide might then present a potentially useful cell surface target for therapeutic molecules such as cytolytic antibodies, or antibodies attached to cytotoxic, or cytolytic, agents.

[0106] The present invention specifically contemplates use of antibodies against the polypeptides encoded by the polynucleotides corresponding to the genes disclosed herein, whereby said antibodies are conjugated to one or more cytotoxic agents so that the antibodies serve to target the conjugated immunotoxins to a region of cancerous activity, such as a solid tumor. For many known cytotoxic agents, lack of selectivity has presented a drawback to their use as therapeutic agents in the treatment of malignancies. For example, the class of two-chain toxins, consisting of a binding subunit (or B-chain) linked to a toxic subunit (A-chain), is extremely cytotoxic. Thus, an agent such as ricin, a protein isolated from castor beans, kills cells at very low concentrations (even less than 10⁻¹¹ M) by inactivating ribosomes in said cells (see, for example, Lord et al., Ricin: structure, mode of action, and some current applications. FASEB J, 8: 201-208 (1994), and Blättler et al., Realizing the full potential of immunotoxins. Cancer Cells, 1: 50-55 (1989)). While isolated A-chains of protein toxins that functionally resemble ricin A-chain are only weakly cytotoxic for intact cells (in the concentration range of 10⁻⁷ to 10⁻⁶ M), they are very potent cytotoxic agents inside the cells. Thus, a single molecule of the A-subunit of diphtheria toxin can kill a cell once inside (see: Yamaizumi et al., One molecule of diphtheria toxin fragment A introduced into a cell can kill the cell. Cell, 15: 245-250, 1978).

[0107] The present invention solves this selectivity problem by using antibodies specific for antigens present on cancer cells to target the cytotoxins to said cells. In addition, use of antibodies decreases toxicity because the antibodies are non-toxic until they reach the tumor and, because the cytotoxin is bound to the antibody, it is presented with less opportunity to cause damage to non-targeted tissues.

[0108] In addition, use of such antibodies alone can provide therapeutic effects on the tumor through the antibody-dependent cellular cytotoxic response (ADCC) and complement-mediated cell lysis mechanisms.

[0109] A number of recombinant immunotoxins (for example, consisting of Fv regions of cancer specific antibodies fused to truncated bacterial toxins) are well known (see, for example, Smyth et al., Specific targeting of chlorambucil to tumors with the use of monoclonal antibodies, J. Natl. Cancer Inst, 76(3):503-510 (1986); Cho et al., Single-chain Fv/folate conjugates mediate efficient lysis of folate-receptor-positive tumor cells, Bioconjug. Chem., 8(3):338-346 (1997)). As noted in the literature, these may contain, for example, a truncated version of Pseudomonas exotoxin as a toxic moiety but the toxin is modified in such a manner that by itself it does not bind to normal human cells, but it retains all other functions of cytotoxicity. Here, recombinant antibody fragments target the modified toxin to cancer cells which are killed, such as by direct inhibition of protein synthesis, or by concomitant induction of apoptosis. Cells that are not recognized by the antibody fragment, because they do not carry the cancer antigen, are not affected. Good activity and specificity have been observed for many recombinant immunotoxins in in vitro assays using cultured cancer cells as well as in animal tumor models. Ongoing clinical trials provide examples where the promising pre-clinical data correlate with successful results in experimental cancer therapy (see, for example, Brinkmann U., Recombinant antibody fragments and immunotoxin fusions for cancer therapy, In Vivo (2000) 14:21-27).

[0110] While the safety of employing immunoconjugates in humans has been established, in vivo therapeutic results have been less impressive. Because clinical use of mouse MAbs in humans is limited by the development of a foreign anti-globulin immune response by the human host, genetically engineered chimeric human-mouse MAbs have been developed by replacing the mouse Fc region with the human constant region. In other cases, the mouse antibodies have been “humanized” by replacing the framework regions of the variable domains of rodent antibodies with their human equivalents. Such humanized and engineered antibodies can even be structurally arranged to have specificities and effector functions that are determined by design and that do not appear in nature. The development of bispecific antibodies, having different binding ends so that more than one antigenic site can be bound, has proven useful in targeting cancer cells. Thus, such antibody specificity has been improved by chemical coupling to various agents such as bacterial or plant toxins, radionuclides or cytotoxic drugs and other agents (see, for example, Bodey, B. et al., Genetically engineered monoclonal antibodies for direct anti-neoplastic treatment and cancer cell specific delivery of chemotherapeutic agents. Curr Pharm Des (2000) February;6(3):261-76). See also Garnett, M. C., Targeted drug conjugates: principles and progress. Adv. Drug Deliv. Rev. (Dec. 17, 2001) 53(2):171-216; Brinkmann et al., Recombinant immunotoxins for cancer therapy. Expert Opin Biol Ther. (2001) 1(4):693-702.

[0111] Among the cytotoxic agents specifically contemplated for use as immunoconjugates according to the present invention is calicheamicin, a highly toxic enediyne antibiotic isolated from Micromonospora echinospora ssp. calichensis, which binds to the minor groove of DNA to induce double strand breaks and cell death (see: Lee et al., Calicheamicins, a novel family of antitumor antibiotics. 1. Chemistry and partial structure of calichemicin γ₁. J Am Chem Soc, 109: 3464-3466 (1987); Zein et al., Calicheamicin gamma 1I: an antitumor antibiotic that cleaves double-stranded DNA site specifically, Science, 240: 1198-1201 (1988)). Useful derivatives of the calicheamicins include Mylotarg and 138H11-Camθ. Mylotarg is an immunoconjugate of a humanized anti-CD33 antibody (CD33 being found in leukemic cells of most patients with acute myeloid leukemia) and N-acetyl gamma calicheamicin dimethyl hydrazide, the latter of which is readily coupled to an antibody of the present invention (in place of the anti-CD33 antibody, and which can also be humanized by substitution of human framework regions into the antibody during production as described elsewhere herein) to form an immunoconjugate of the invention (see: Hamann et al., Gemtuzumab Ozogamicin, A Potent and Selective Anti-CD33 Antibody-Calicheamicin Conjugate for Treatment of Acute Myeloid Leukemia, Bioconjug. Chem. 13, 47-58 (2002)). In 138H11-Camθ, the anti-γ-glutamyl transferase antibody 138H11 is coupled to theta calicheamicin through a disulfide linkage, and the conjugate has been found useful in vitro against cultured renal cell carcinoma cells (see: Knoll et al., Targeted therapy of experimental renal cell carcinoma with a novel conjugate of monoclonal antibody 138H11 and calicheamicin θ₁^(I), Cancer Res, 60: 6089-6094 (2000)). The same linkage may be utilized to link this cytotoxic agent to an antibody of the present invention, thereby forming a targeting structure for breast or liver cancer cells.

[0112] Also useful in forming the immunoconjugates of the invention is DC1, a disulfide-containing analog of adozelesin that kills cells by binding to the minor groove of DNA, followed by alkylation of adenine bases. Adozelesin is a structural analog of CC-1065, an anti-tumor antibiotic isolated from microbial fermentation of Streptomyces zelensis, and is about 1,000 fold more toxic to cultured cell lines than other DNA interacting agents, such as cis-platin and doxorubicin. This agent is readily linked to antibodies through the disulfide bond of adozelesin (see: Chari et al., Enhancement of the selectivity and antitumor efficacy of a CC-1065 analogue through immunoconjugate formation, Cancer Res, 55: 4079-4084 (1995)).

[0113] Maytansine, a highly cytotoxic microtubular inhibitor isolated from the shrub Maytenus serrata, was found to have little value in human clinical trials but is much more effective in its derivatized form, denoted DM1, which contains a disulfide bond to facilitate linkage to antibodies and is up to 10-fold more cytotoxic (see: Chari et al., Immunoconjugates containing novel maytansinoids: promising anticancer drugs, Cancer Res, 52: 127-131 (1992)). These same in vitro studies showed that up to four DM1 molecules could be linked to a single immunoglobulin without destroying the binding affinity. Such conjugates have been used against breast cancer antigens, such as the neu/HER2/erbB-2 antigen (see: Goldmacher et al., Immunogen, Inc., (2002) in press; also see Liu, C. et al., Eradication of large colon tumor xenografts by targeted delivery of maytansinoids, Proc. Natl. Acad. Sci. USA, 93, 8618-8623 (1996)). For example, Liu et al. (1996) describe formation of an immunoconjugate of the maytansinoid cytotoxin DM1 and C242 antibody, a murine IgG1 immunoglobulin available from Pharmacia, which has affinity for a mucin-like glycoprotein variably expressed by human colorectal cancers. The latter immunoconjugate was prepared according to Chari et al., Cancer Res., 52:127-131 (1992) and was found to be highly cytotoxic against cultured colon cancer cells as well as showing anti-tumor effects in vivo in mice bearing subcutaneous COLO 205 human colon tumor xenografts using doses well below the maximum tolerated dose.

[0114] In addition, there are a variety of protein toxins (cytotoxic proteins), which include a number of different classes, such as those that inhibit protein synthesis: ribosome-inactivating proteins of plant origin, such as ricin, abrin, gelonin, and a number of others, and bacterial toxins such as pseudomonas exotoxin and diphtheria toxin.

[0115] Another useful class is the one including taxol, taxotere, and taxoids. Specific examples include paclitaxel (taxol), its analog docetaxel (taxotere), and derivatives thereof. The first two are clinical drugs used in treating a number of tumors while the taxoids act to induce cell death by inhibiting the de-polymerization of tubulin. Such agents are readily linked to antibodies through disulfide bonds without disadvantageous effects on binding specificity.

[0116] In one instance, a truncated Pseudomonas exotoxin was fused to an anti-CD22 variable fragment and used successfully to treat patients with chemotherapy-resistant hairy-cell leukemia. (see: Kreitman et al., Efficacy of the anti-CD22 recombinant immunotoxin BL22 in chemotherapy-resistant hairy-cell leukemia, N Engl J Med, 345: 241-247 (2001)) Conversely, the cancer-linked peptides of the present invention offer the opportunity to prepare antibodies, recombinant or otherwise, against the appropriate antigens to target solid tumors, preferably those of malignancies of breast or liver tissue, using the same or similar cytotoxic conjugates. Thus, many of the previously used immunoconjugates have been formed using antibodies against general antigenic sites linked to cancers whereas the antibodies formed using the peptides disclosed herein are more specific and target the antibody-cytotoxic agent to a particular tissue or organ, thus further reducing toxicity and other undesirable side effects.

[0117] In addition, the immunoconjugates formed using the antibodies prepared against the cancer-linked antigens disclosed herein can be formed by any type of chemical coupling. Thus, the cytotoxic agent of choice, along with the immunoglobulin, can be coupled by any type of chemical linkage, covalent or non-covalent, including electrostatic linkage, to form the immunoconjugates of the present invention.

[0118] When used as immunoconjugates, the antitumor agents of the present invention represent a class of pro-drugs that are relatively non-toxic when first administered to an animal (due mostly to the stability of the immunoconjugate), such as a human patient, but which are targeted by the conjugated immunoglobulin to a cancer cell where they then exhibit good toxicity. The tumor-related, associated, or linked, antigens, preferably those presented herein, serve as targets for the antibodies (monoclonal, recombinant, and the like) specific for said antigens. The end result is the release of active cytotoxic agent inside the cell after binding of the immunoglobulin portion of the immunoconjugate.

[0119] The cited references describe a number of useful procedures for the chemical linkage of cytotoxic agents to immunoglobulins and the disclosures of all such references cited herein are hereby incorporated by reference in their entirety. For other reviews see Ghetie et al., Immunotoxins in the therapy of cancer: from bench to clinic, Pharmacol Ther, 63: 209-234 (1994), Pietersz et al. The use of monoclonal antibody immunoconjugates in cancer therapy, Adv Exp Med Biol, 353:169-179 (1994), and Pietersz, G. A. The linkage of cytotoxic drugs to monoclonal antibodies for the treatment of cancer, Bioconjug Chem, 1:89-95 (1990).

[0120] Thus, the present invention provides highly useful cancer-associated antigens for generation of antibodies for linkage to a number of different cytotoxic agents which are already known to have some in vitro toxicity and possess chemical groups available for linkage to antibodies.

[0121] The present invention also relates to a process that comprises a method for producing test data comprising identifying an agent according to one of the disclosed processes for identifying such an agent (i.e., the therapeutic agents identified according to the assay procedures disclosed herein) wherein said product is the data collected with respect to said agent as a result of said identification process, or assay, and wherein said data is sufficient to convey the chemical character and/or structure and/or properties of said agent. For example, the present invention specifically contemplates a situation whereby a user of an assay of the invention may use the assay to screen for compounds having the desired enzyme modulating activity and, having identified the compound, then conveys that information (i.e., information as to structure, dosage, etc.) to another user who then utilizes the information to reproduce the agent and administer it for therapeutic or research purposes according to the invention. For example, the user of the assay (user 1) may screen a number of test compounds without knowing the structure or identity of the compounds (such as where code numbers are used and the first user is simply given samples labeled with said code numbers) and, after performing the screening process using one or more assay processes of the present invention, then imparts to a second user (user 2), verbally or in writing or in some equivalent fashion, sufficient information to identify the compounds having a particular modulating activity (for example, the code number with the corresponding results). This transmission of information from user 1 to user 2 is specifically contemplated by the present invention.

[0122] In one embodiment of the foregoing, the present invention relates to a method for producing test data with respect to the gene modulating activity of a compound comprising:

[0123] (a) contacting a compound with a cell containing a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27, under conditions promoting the expression of said gene;

[0124] (b) detecting a difference in expression of said gene relative to when said compound is not present; and

[0125] (c) producing test data with respect to the gene modulating activity of said compound based on a decrease in the expression of the determined genes that correspond to SEQ ID NO: 1-27 indicating gene modulating action.
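Steps (a) through (c), together with the coded hand-off from user 1 to user 2 described above, can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions: the names `AssayRecord` and `produce_test_data` and the 0.5 fold-change cut-off are hypothetical and are not specified in this disclosure, which requires only that a decrease in expression be detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    """One blinded sample as seen by user 1 (hypothetical structure)."""
    code_number: str            # label on the sample; identity unknown to user 1
    expression_without: float   # expression measured with compound buffer alone
    expression_with: float      # expression measured with the test compound

    @property
    def fold_change(self) -> float:
        # Step (b): expression with the compound relative to without it.
        return self.expression_with / self.expression_without


def produce_test_data(records, max_fold_change=0.5):
    """Step (c): report code numbers showing a decrease in expression.

    max_fold_change is an assumed cut-off chosen for illustration only.
    """
    return [
        {"code_number": r.code_number, "fold_change": round(r.fold_change, 3)}
        for r in records
        if r.fold_change <= max_fold_change
    ]


if __name__ == "__main__":
    screened = [
        AssayRecord("C-001", 1.00, 0.90),
        AssayRecord("C-002", 1.00, 0.25),
    ]
    # This list is what user 1 would pass on to user 2.
    print(produce_test_data(screened))
```

User 2, who holds the key relating code numbers to compounds, could then look up each reported code number to recover the identity of the active agent.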

[0126] It should be cautioned that, in carrying out the procedures of the present invention as disclosed herein, whether to form immunoconjugates or screen for other antitumor agents using the genes and polypeptides disclosed herein, any reference to particular buffers, media, reagents, cells, culture conditions and the like is not intended to be limiting, but is to be read so as to include all related materials that one of ordinary skill in the art would recognize as being of interest or value in the particular context in which that discussion is presented. For example, it is often possible to substitute one buffer system or culture medium for another and still achieve similar, if not identical, results. Those of skill in the art will have sufficient knowledge of such systems and methodologies so as to be able, without undue experimentation, to make such substitutions as will optimally serve their purposes in using the methods and procedures disclosed herein.

[0127] The present invention will now be further described by way of the following non-limiting example. In applying the disclosure of the example, it should be kept clearly in mind that other and different embodiments of the methods disclosed according to the present invention will no doubt suggest themselves to those of skill in the relevant art. The following example shows how a potential anti-neoplastic agent may be identified using one or more of the genes disclosed herein.

EXAMPLE Determination of Gene Inhibitory Activity of an Anti-Neoplastic Agent

[0128] Tumor cells are grown to a density of 10⁵ cells/cm² in Leibovitz's L-15 medium supplemented with 2 mM L-glutamine (90%) and 10% fetal bovine serum. The cells are collected after treatment with 0.25% trypsin, 0.02% EDTA at 37° C. for 2 to 5 minutes. The trypsinized cells are then diluted with 30 ml growth medium and plated at a density of 50,000 cells per well in a 96 well plate (100 μl/well). The following day, cells are treated with either compound buffer alone, or compound buffer containing a chemical agent to be tested, for 24 hours. The medium is then removed, the cells are lysed and the RNA is recovered using the RNeasy reagents and protocol obtained from Qiagen. RNA is quantitated and 10 ng of sample in 1 μl are added to 24 μl of TaqMan reaction mix containing 1×PCR buffer, RNasin, reverse transcriptase, nucleoside triphosphates, AmpliTaq Gold, Tween 20, glycerol, bovine serum albumin (BSA) and specific PCR primers and probes for a reference gene (18S RNA) and a test gene (Gene X). Reverse transcription is then carried out at 48° C. for 30 minutes. The sample is then applied to a Perkin Elmer 7700 sequence detector and heat denatured for 10 minutes at 95° C. Amplification is performed through 40 cycles using 15 seconds annealing at 60° C. followed by a 60 second extension at 72° C. and 30 second denaturation at 95° C. Data files are then captured and the data analyzed with the appropriate baseline windows and thresholds.
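The reaction set-up and thermal profile recited above lend themselves to a simple arithmetic check. The following sketch encodes the stated temperatures, hold times and volumes and derives the final reaction volume, the RNA concentration in the reaction, and the total programmed hold time. The variable and function names are hypothetical, and the total excludes instrument ramp times, which are not given in this example.

```python
# Values taken from the example; names are illustrative only.
RNA_INPUT_NG = 10.0          # ng of RNA added per reaction
RNA_VOLUME_UL = 1.0          # volume of the RNA sample, in microliters
REACTION_MIX_UL = 24.0       # volume of reaction mix, in microliters

# (step name, temperature in deg C, hold time in seconds)
REVERSE_TRANSCRIPTION = ("reverse transcription", 48.0, 30 * 60)
INITIAL_DENATURATION = ("initial denaturation", 95.0, 10 * 60)
CYCLE_STEPS = [
    ("annealing", 60.0, 15),
    ("extension", 72.0, 60),
    ("denaturation", 95.0, 30),
]
N_CYCLES = 40


def final_volume_ul() -> float:
    """Total reaction volume after adding the RNA sample to the mix."""
    return RNA_VOLUME_UL + REACTION_MIX_UL


def rna_concentration_ng_per_ul() -> float:
    """RNA concentration in the assembled reaction."""
    return RNA_INPUT_NG / final_volume_ul()


def total_hold_seconds() -> int:
    """Sum of programmed hold times only; ramp times are not included."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return REVERSE_TRANSCRIPTION[2] + INITIAL_DENATURATION[2] + N_CYCLES * per_cycle


if __name__ == "__main__":
    print(f"Reaction volume: {final_volume_ul()} ul")
    print(f"RNA concentration: {rna_concentration_ng_per_ul()} ng/ul")
    print(f"Programmed hold time: {total_hold_seconds() / 60} min")
```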

[0129] For the experiments disclosed herein, anti-neoplastic agents are commonly present in the culture medium at a concentration of about 10 μM, but the value can vary throughout the range of 1 through 100 μM, with lower concentrations being valuable for cases where the agent is especially effective and lower concentrations are needed to make accurate determinations.
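One way to cover the stated 1 through 100 μM range is with a logarithmically spaced series of test concentrations, sketched below. The use of log spacing and of five points is an assumption made for illustration and is not prescribed by this example; the function name is likewise hypothetical. With five points the series passes through the commonly used value of about 10 μM.

```python
def log_spaced_concentrations(low_um=1.0, high_um=100.0, points=5):
    """Return `points` concentrations (in micromolar) evenly spaced on a log scale.

    The defaults span the 1 to 100 micromolar range given in the example.
    """
    if points < 2:
        raise ValueError("at least two points are needed to span a range")
    if low_um <= 0 or high_um <= low_um:
        raise ValueError("require 0 < low_um < high_um")
    # Constant ratio between neighbouring concentrations.
    ratio = (high_um / low_um) ** (1.0 / (points - 1))
    return [round(low_um * ratio ** i, 2) for i in range(points)]


if __name__ == "__main__":
    for concentration in log_spaced_concentrations():
        print(f"{concentration} uM")
```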

[0130] The quantitative difference between the target and reference gene is then calculated and a relative expression value determined for all of the samples used. In this way, the ability of a chemotherapeutic agent to effectively and selectively reduce the activity of a cancer-specific gene is readily ascertained. The overall expression of the cancer-specific gene, as modulated by one chemical agent relative to another, is also determined. Chemical agents having the most effect in reducing gene activity are thereby identified as the most anti-neoplastic.
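The calculation described in this paragraph can be sketched using threshold cycle (Ct) values. The sketch assumes the widely used comparative Ct (2^-ΔΔCt) calculation with approximately 100% amplification efficiency for both the test gene and the 18S reference; the example does not state which calculation was actually used, and the function names and Ct values shown are hypothetical.

```python
def relative_expression(ct_target_treated, ct_reference_treated,
                        ct_target_control, ct_reference_control):
    """Relative expression of the test gene in treated versus control cells.

    Assumes the comparative Ct method with ideal doubling per cycle.
    A value below 1.0 indicates reduced expression in treated cells.
    """
    # Difference between target and reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_reference_treated
    delta_ct_control = ct_target_control - ct_reference_control
    # Difference between treated and buffer-only samples.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


def rank_agents(treated_cts, control_cts):
    """Rank agents from strongest to weakest reduction of gene expression.

    treated_cts maps an agent label to (ct_target, ct_reference);
    control_cts is the (ct_target, ct_reference) pair for buffer alone.
    """
    scores = {
        agent: relative_expression(ct_t, ct_r, control_cts[0], control_cts[1])
        for agent, (ct_t, ct_r) in treated_cts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    control = (25.0, 12.0)              # illustrative Ct values only
    treated = {
        "agent A": (27.0, 12.0),        # target detected two cycles later
        "agent B": (25.0, 12.0),        # no change
    }
    for agent, value in rank_agents(treated, control):
        print(agent, value)
```

Under these assumptions, an agent that delays detection of the test gene by two cycles relative to the reference corresponds to a four-fold reduction in relative expression.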

REFERENCES

[0131] Walter A. Blättler and Ravi Chari: Drugs to enhance the therapeutic potency of anti-cancer antibodies: antibody-drug conjugates as tumor-activated prodrugs. In Anticancer Agents—Frontiers in Cancer Chemotherapy (Iwao Ojima, Gregory D. Vite, Karl-Heinz Altmann, Eds.), American Chemical Society, pp. 317-338 (2001).

[0132] Dan L. Longo, Patricia L. Duffey, John G. Gribben, Elaine S. Jaffe, Brendan D. Curti, Barry L. Gause, John E. Janik, Virginia M. Braman, Dixie Esseltine, Wyndham H. Wilson, Dwight Kaufman, Robert E. Wittes, Lee M. Nadler, and Walter J. Urba: Combination chemotherapy followed by an Immunotoxin (Anti-B4-blocked Ricin) in patients with indolent lymphoma: results of a Phase II study. Cancer J. 6, 146-150 (2000).

[0133] Walter A. Blättler and John M. Lambert: Preclinical immunotoxin development. In Monoclonal Antibody-Based Therapy of Cancer (M. Grossbard, Ed.), Marcel Dekker, Inc. NY, N.Y., pp. 1-22 (1998).

[0134] Ravi V. J. Chari: Targeted delivery of chemotherapeutics: tumor-activated prodrug therapy. In Advanced Drug Delivery Reviews, Elsevier Science B.V., pp. 89-104 (1998).

[0135] David T. Scadden, David P. Schenkein, Zale Bernstein, Barry Luskey, John Doweiko, Anil Tulpule, and Alexandra M. Levine: Immunotoxin combined with chemotherapy for patients with AIDS-related Non-Hodgkin's Lymphoma. Cancer 83, 2580-2587 (1998).

[0136] Changnian Liu and Ravi V. J. Chari: The development of antibody delivery systems to target cancer with highly potent maytansinoids. Exp. Opin. Invest. Drugs 6, 169-172 (1997).

[0137] A. C. Goulet, Viktor S. Goldmacher, John M. Lambert, C. Baron, Dennis C. Roy and E. Kouassi: Conjugation of blocked ricin to an anti-CD19 monoclonal antibody increases antibody-induced cell calcium mobilization and CD19 internalization. Blood 90, 2364-2375 (1997).

[0138] Changnian Liu, John M. Lambert, Beverly A. Teicher, Walter A. Blättler, and Rosemary O'Connor: Cure of multidrug-resistant human B-cell lymphoma xenografts by combinations of anti-B4-blocked ricin and chemotherapeutic drugs. Blood 87, 3892-3898 (1996).

[0139] Rajeeva Singh, Lana Kats, Walter A. Blättler, and John M. Lambert: Formation of N-Substituted 2-Iminothiolanes when amino groups in proteins and peptides are modified by 2-Iminothiolane. Anal. Biochem. 236, 114-125 (1996).

[0140] Changnian Liu, B. Mitra Tadayoni, Lizabeth A. Bourret, Kristin M. Mattocks, Susan M. Derr, Wayne C. Widdison, Nancy L. Kedersha, Pamela D. Ariniello, Victor S. Goldmacher, John M. Lambert, Walter A. Blättler, and Ravi V. J. Chari:

[0141] Eradication of large colon tumor xenografts by targeted delivery of maytansinoids. Proc. Natl. Acad. Sci. USA 93, 8618-8623 (1996).

[0142] Denis C. Roy, Sophie Ouellet, Christiane Le Houiller, Pamela D. Ariniello, Claude Perreault and John M. Lambert: Elimination of neuroblastoma and small-cell lung cancer cells with an anti-neural cell adhesion molecule immunotoxin. J. Natl. Cancer Inst. 88, 1136-1145 (1996).

[0143] Walter A. Blättler, Ravi V. J. Chari and John M. Lambert: Immunoconjugates. In Cancer Therapeutics: Experimental and Clinical Agents. (B. Teicher, Ed.), Humana Press, Totowa, N.J., pp. 371-394 (1996).

[0144] Michael L. Grossbard, John M. Lambert, Victor S. Goldmacher, Arnold S. Freedman, Jeanne Kinsella, Danny P. Ducello, Susan N. Rabinowe, Laura Eliseo, Felice Carol, James A. Taylor, Walter A. Blättler, Carol L. Epstein, and Lee M. Nadler: Anti-B4-blocked Ricin: A phase I trial of 7 day continuous infusion in patients with B-cell neoplasms. J. Clin. Oncol. 11, 726-737 (1993).

[0145] Michael L. Grossbard, John G. Gribben, Arnold S. Freedman, John M. Lambert, Jeanne Kinsella, Susan N. Rabinowe, Laura Eliseo, James A. Taylor, Walter A. Blättler, Carol L. Epstein, and Lee M. Nadler: Adjuvant immunotoxin therapy with anti-B4-blocked ricin following autologous bone marrow transplantation for patients with B-cell Non-Hodgkin's lymphoma. Blood 81, 2263-2271 (1993).

[0146] Sudhir A. Shah, Patricia M. Halloran, Cynthia A. Ferris, Beth A. Levine, Lizabeth A. Bourret, Victor S. Goldmacher, and Walter A. Blättler: Anti-B4-blocked Ricin immunotoxin shows therapeutic efficacy in four different SCID mouse tumor models. Cancer Res. 53, 1360-1367 (1993).

[0147] Ravi V. J. Chari, Bridget A. Martell, Jonathan L. Gross, Sherilyn B. Cook, Sudhir A. Shah, Walter A. Blättler, Sara J. McKenzie, and Victor S. Goldmacher: Immunoconjugates containing novel maytansinoids: promising anti-cancer drugs. Cancer Res. 52, 127-131 (1992).

[0148] John M. Lambert, Peter D. Senter, Annie Yau-Young, Walter A. Blättler, and Victor S. Goldmacher: Purified immunotoxins that are reactive with human lymphoid cells. J. Biol. Chem. 260, 12035-12041 (1985).

1 29 1 6159 DNA Homo sapiens 1 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 
1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 
tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttt aaaatggtgc atttgtgctt ctgaactatt ttgaagagtc 4500 acttctgttt acctcaagta tcaattcatc ctccatacat ttgaattcaa gttgtttttt 4560 gtcaaattta cagttgtcaa ttgatcttca agctgcaggg tgcctagaaa tgggccgttg 4620 tctgtagccc tggcatgtgc acacggacat ttgccaccac tgcaagcaaa agtctggaga 4680 agttcaccaa cgacaagaac gattagggaa aatatgctgc tgtgggttaa caactcagaa 4740 agtccctgat ccacatttgg ctgtttacta aagcttgtga ttaacttttt ggcagtgtgt 4800 actatgctct attgctatat atgctatcta taaatgtaga tgttaaggat aagtaattct 4860 aaatttatta ttctatagtt ttgaagtttg gttaagtttc ctttcactca attgatttat 4920 tttgttgtta atcaaattta tgttaattgg atcctttaaa ttttttttgg cattttccaa 4980 caaaaatggc tttattcata agaaaggaaa aaaatcaatg gaatttgata tctaaagaag 5040 ttagaaaggg 
agcaaaataa aaaacataaa ggagatagat gaattagtaa gcaaatcagt 5100 agtcgagttt ttcaaactgg caaaattaat taattgactt ttagcccaaa tttacattgt 5160 taattaaatc aagaaggaag aagatctaag agctcccatt gataggcaag cctagagaga 5220 actagctaaa tttatcatgc taggatattg aaacacagaa agtttacata catttatgaa 5280 gggtcaattt agtttggaca gtgaggtatt tgtcttagtg gaaaaaagga gaattagtct 5340 gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac aagtacctct tggacttctg 5400 aattgatcca gttgtcatcc accacagaca tctcacatca gatacagaca gttccaagat 5460 tgacaacaga gaacaacctg ctggaaagac ctgggcagaa atggagagcc ctgcgggaac 5520 catgctacat tttcatctaa agagagaatg cacatctgat gagactgaaa gttctttgtt 5580 gttttagatt gtagaatggt attgaattgg tctgtggaaa attgcattgc ttttatttct 5640 ttgtgtaatc aagtttaagt aataggggat atataatcat aagcatttta gggtgggagg 5700 gactattaag taattttaag tgggtggggt tatttagaat gttagaataa tattatgtat 5760 tagatatcgc tataagtgga catgcgtact tacttgtaac cctttaccct ataattgcta 5820 tccttaaaga tttcaaataa actcggaggg aactgcaggg agaccaactt atttagagcg 5880 aattggacat ggataaaaac cccagtggga gaaagttcaa aggtgattag attaataatt 5940 taatagagga tgagtgacct ctgataaatt actgctagaa tgaacttgtc aatgatggat 6000 ggtaaatttt catggaagtt ataaaagtga taaatatgcc cctacccctc cttctaactt 6060 tattgctgta ttctcttcac tctatatttc tctctatttg ctaatattgc attgctgtta 6120 caataaaaat tcaataaaga tttagtggtt aagtgcaaa 6159 2 5165 DNA Homo sapiens 2 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca 
aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt 
catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg 
tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttt aaaatggtgc atttgtgctt ctgaactatt ttgaagagtc 4500 acttctgttt acctcaagta tcaattcatc ctccatacat ttgaattcaa gttgtttttt 4560 gtcaaattta cagttgtcaa ttgatcttca agctgcaggg tgcctagaaa tgggccgttg 4620 tctgtagccc tggcatgtgc acacggacat ttgccaccac tgcaagcaaa agtctggaga 4680 agttcaccaa cgacaagaac gattagggaa aatatgctgc tgtgggttaa caactcagaa 4740 agtccctgat ccacatttgg ctgtttacta aagcttgtga ttaacttttt ggcagtgtgt 4800 actatgctct attgctatat atgctatcta taaatgtaga tgttaaggat aagtaattct 4860 aaatttatta ttctatagtt ttgaagtttg gttaagtttc ctttcactca attgatttat 4920 tttgttgtta atcaaattta tgttaattgg atcctttaaa ttttttttgg cattttccaa 4980 caaaaatggc tttattcata agaaaggaaa aaaatcaatg gaatttgata tctaaagaag 5040 ttagaaaggg agcaaaataa aaaacataaa ggagatagat gaattagtaa gcaaatcagt 5100 agtcgagttt ttcaaactgg caaaattaat taattgactt ttagcccaaa tttacattgt 5160 taatt 5165 3 5473 DNA Homo sapiens 3 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga 
ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat 
gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca 
ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttacttttaa aatggtgcat ttgtgcttct gaactatttt 4800 gaagagtcac ttctgtttac ctcaagtatc aattcatcct ccatacattt gaattcaagt 4860 tgttttttgt caaatttaca gttgtcaatt gatcttcaag ctgcagggtg cctagaaatg 4920 ggccgttgtc tgtagccctg gcatgtgcac acggacattt gccaccactg caagcaaaag 4980 tctggagaag ttcaccaacg acaagaacga ttagggaaaa tatgctgctg tgggttaaca 5040 actcagaaag tccctgatcc acatttggct gtttactaaa gcttgtgatt aactttttgg 5100 cagtgtgtac tatgctctat tgctatatat gctatctata aatgtagatg ttaaggataa 5160 gtaattctaa atttattatt ctatagtttt gaagtttggt taagtttcct ttcactcaat 5220 tgatttattt tgttgttaat caaatttatg ttaattggat cctttaaatt ttttttggca 5280 ttttccaaca aaaatggctt tattcataag aaaggaaaaa aatcaatgga atttgatatc 5340 taaagaagtt agaaagggag caaaataaaa aacataaagg agatagatga attagtaagc 5400 aaatcagtag tcgagttttt caaactggca aaattaatta attgactttt agcccaaatt 5460 tacattgtta att 5473 4 6214 DNA Homo sapiens 4 
atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga 
tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga 
ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttc tcaagtatca attcatcctc catacatttg aattcaagtt 4500 gttttttgtc aaatttacag ttgtcaattg atcttcaagc tgcagggtgc ctagaaatgg 4560 gccgttgtct gtagccctgg catgtgcaca cggacatttg ccaccactgc aagcaaaagt 4620 ctggagaagt tcaccaacga caagaacgat tagggaaaat atgctgctgt gggttaacaa 4680 ctcagaaagt ccctgatcca catttggctg tttactaaag cttgtgatta actttttggc 4740 agtgtgtact atgctctatt gctatatatg ctatctataa atgtagatgt taaggataag 4800 taattctaaa tttattattc tatagttttg aagtttggtt aagtttcctt tcactcaatt 4860 gatttatttt gttgttaatc aaatttatgt taattggatc ctttaaattt tttttggcat 4920 tttccaacaa aaatggcttt attcataaga aaggaaaaaa atcaatggaa tttgatatct 4980 aaagaagtta gaaagggagc aaaataaaaa acataaagga gatagatgaa ttagtaagca 5040 aatcagtagt cgagtttttc aaactggcaa aattaattaa 
ttgactttta gcccaaattt 5100 acattgttaa ttaaatcaag aaggaagaag atctaagagc tcccattgat aggcaagcct 5160 agagagaact agctaaattt atcatgctag gatattgaaa cacagaaagt ttacatacat 5220 ttatgaaggg tcaatttagt ttggacagtg aggtatttgt cttagtggaa aaaaggagaa 5280 ttagtctgat caaatcgtga agtaatacag tgaacttgca ggtgcacaaa ataagagggc 5340 cacatctata tggtgcagtc tggaattctg tttaagtttg taggtacctc ttggacttct 5400 gaattgatcc agttgtcatc caccacagac atctcacatc agatacagac agttccaaga 5460 ttgacaacag agaacaacct gctggaaaga cctgggcaga aatggagagc cctgcgggaa 5520 ccatgctaca ttttcatcta aagagagaat gcacatctga tgagactgaa agttctttgt 5580 tgttttagat tgtagaatgg tattgaattg gtctgtggaa aattgcattg cttttatttc 5640 tttgtgtaat caagtttaag taatagggga tatataatca taagcatttt agggtgggag 5700 ggactattaa gtaattttaa gtgggtgggg ttatttagaa tgttagaata atattatgta 5760 ttagatatcg ctataagtgg acatgcgtac ttacttgtaa ccctttaccc tataattgct 5820 atccttaaag atttcaaata aactcggagg gaactgcagg gagaccaact tatttagagc 5880 gaattggaca tggataaaaa ccccagtggg agaaagttca aaggtgatta gattaataat 5940 ttaatagagg atgagtgacc tctgataaat tactgctaga atgaacttgt caatgatgga 6000 tggtaaattt tcatggaagt tataaaagtg ataaataaaa acccttgctt ttacccctgt 6060 cagtagccct cctcctacca ctgaacccca ttgcccctac ccctccttct aactttattg 6120 ctgtattctc ttcactctat atttctctct atttgctaat attgcattgc tgttacaata 6180 aaaattcaat aaagatttag tggttaagtg caaa 6214 5 6575 DNA Homo sapiens 5 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 
540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag 
ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc 
ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttacttttaa aatggtgcat ttgtgcttct gaactatttt 4800 gaagagtcac ttctgtttac ctcaagtatc aattcatcct ccatacattt gaattcaagt 4860 tgttttttgt caaatttaca gttgtcaatt gatcttcaag ctgcagggtg cctagaaatg 4920 ggccgttgtc tgtagccctg gcatgtgcac acggacattt gccaccactg caagcaaaag 4980 tctggagaag ttcaccaacg acaagaacga ttagggaaaa tatgctgctg tgggttaaca 5040 actcagaaag tccctgatcc acatttggct gtttactaaa gcttgtgatt aactttttgg 5100 cagtgtgtac tatgctctat tgctatatat gctatctata aatgtagatg ttaaggataa 5160 gtaattctaa atttattatt ctatagtttt gaagtttggt taagtttcct ttcactcaat 5220 tgatttattt tgttgttaat caaatttatg ttaattggat cctttaaatt ttttttggca 5280 ttttccaaca aaaatggctt tattcataag aaaggaaaaa aatcaatgga atttgatatc 5340 taaagaagtt agaaagggag caaaataaaa aacataaagg agatagatga attagtaagc 5400 aaatcagtag tcgagttttt caaactggca aaattaatta attgactttt agcccaaatt 5460 tacattgtta attaaatcaa gaaggaagaa gatctaagag ctcccattga taggcaagcc 5520 tagagagaac tagctaaatt tatcatgcta ggatattgaa acacagaaag tttacataca 5580 tttatgaagg gtcaatttag tttggacagt 
gaggtatttg tcttagtgga aaaaaggaga 5640 attagtctga tcaaatcgtg aagtaataca gtgaacttgc aggtgcacaa aataagaggg 5700 ccacatctat atggtgcagt ctggaattct gtttaagttt gtaggtacct cttggacttc 5760 tgaattgatc cagttgtcat ccaccacaga catctcacat cagatacaga cagttccaag 5820 attgacaaca gagaacaacc tgctggaaag acctgggcag aaatggagag ccctgcggga 5880 accatgctac attttcatct aaagagagaa tgcacatctg atgagactga aagttctttg 5940 ttgttttaga ttgtagaatg gtattgaatt ggtctgtgga aaattgcatt gcttttattt 6000 ctttgtgtaa tcaagtttaa gtaatagggg atatataatc ataagcattt tagggtggga 6060 gggactatta agtaatttta agtgggtggg gttatttaga atgttagaat aatattatgt 6120 attagatatc gctataagtg gacatgcgta cttacttgta accctttacc ctataattgc 6180 tatccttaaa gatttcaaat aaactcggag ggaactgcag ggagaccaac ttatttagag 6240 cgaattggac atggataaaa accccagtgg gagaaagttc aaaggtgatt agattaataa 6300 tttaatagag gatgagtgac ctctgataaa ttactgctag aatgaacttg tcaatgatgg 6360 atggtaaatt ttcatggaag ttataaaagt gataaataaa aacccttgct tttacccctg 6420 tcagtagccc tcctcctacc actgaacccc attgccccta cccctccttc taactttatt 6480 gctgtattct cttcactcta tatttctctc tatttgctaa tattgcattg ctgttacaat 6540 aaaaattcaa taaagattta gtggttaagt gcaaa 6575 6 2937 DNA Homo sapiens 6 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca 
tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 
ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaagag aagctgatac aaacgcagga aatgctgatt tctttatgga gggggag 2937 7 6160 DNA Homo sapiens 7 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 
aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact 
tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttc tcaagtatca attcatcctc catacatttg aattcaagtt 4500 gttttttgtc aaatttacag 
ttgtcaattg atcttcaagc tgcagggtgc ctagaaatgg 4560 gccgttgtct gtagccctgg catgtgcaca cggacatttg ccaccactgc aagcaaaagt 4620 ctggagaagt tcaccaacga caagaacgat tagggaaaat atgctgctgt gggttaacaa 4680 ctcagaaagt ccctgatcca catttggctg tttactaaag cttgtgatta actttttggc 4740 agtgtgtact atgctctatt gctatatatg ctatctataa atgtagatgt taaggataag 4800 taattctaaa tttattattc tatagttttg aagtttggtt aagtttcctt tcactcaatt 4860 gatttatttt gttgttaatc aaatttatgt taattggatc ctttaaattt tttttggcat 4920 tttccaacaa aaatggcttt attcataaga aaggaaaaaa atcaatggaa tttgatatct 4980 aaagaagtta gaaagggagc aaaataaaaa acataaagga gatagatgaa ttagtaagca 5040 aatcagtagt cgagtttttc aaactggcaa aattaattaa ttgactttta gcccaaattt 5100 acattgttaa ttaaatcaag aaggaagaag atctaagagc tcccattgat aggcaagcct 5160 agagagaact agctaaattt atcatgctag gatattgaaa cacagaaagt ttacatacat 5220 ttatgaaggg tcaatttagt ttggacagtg aggtatttgt cttagtggaa aaaaggagaa 5280 ttagtctgat caaatcgtga agtaatacag tgaacttgca ggtgcacaaa ataagagggc 5340 cacatctata tggtgcagtc tggaattctg tttaagtttg taggtacctc ttggacttct 5400 gaattgatcc agttgtcatc caccacagac atctcacatc agatacagac agttccaaga 5460 ttgacaacag agaacaacct gctggaaaga cctgggcaga aatggagagc cctgcgggaa 5520 ccatgctaca ttttcatcta aagagagaat gcacatctga tgagactgaa agttctttgt 5580 tgttttagat tgtagaatgg tattgaattg gtctgtggaa aattgcattg cttttatttc 5640 tttgtgtaat caagtttaag taatagggga tatataatca taagcatttt agggtgggag 5700 ggactattaa gtaattttaa gtgggtgggg ttatttagaa tgttagaata atattatgta 5760 ttagatatcg ctataagtgg acatgcgtac ttacttgtaa ccctttaccc tataattgct 5820 atccttaaag atttcaaata aactcggagg gaactgcagg gagaccaact tatttagagc 5880 gaattggaca tggataaaaa ccccagtggg agaaagttca aaggtgatta gattaataat 5940 ttaatagagg atgagtgacc tctgataaat tactgctaga atgaacttgt caatgatgga 6000 tggtaaattt tcatggaagt tataaaagtg ataaatatgc ccctacccct ccttctaact 6060 ttattgctgt attctcttca ctctatattt ctctctattt gctaatattg cattgctgtt 6120 acaataaaaa ttcaataaag atttagtggt taagtgcaaa 6160 8 1012 DNA Homo sapiens 8 aaaaaagtaa atgtctttag 
gtgggaaatg catgcccatg ttgctatagt tgcagcccct 60 ccctctttta atcccactct ttttacacag atagtcacct ttctattttc taattaataa 120 ctttgttcat atcagctaac cattgttgaa tactcactac ataatttacc tgcactattt 180 aattaagtcc tcataattag tttgaggtat actattattg ggcttctttg aatgtcccat 240 tttatagctt ttcctaggag gcatcaatta ttatttcttg aatgactgca taatgacaaa 300 ggaaactgac atacacatat tgctgcttag tgccatttct gtcccataga gaattttctt 360 tggggattta aattatggtc agaaaagtaa tttaggcagt cctgtggttt ttgtggtggt 420 ggttgtttga tttgataacc agttaccttc tcggtggcag tggaataggg ttgtgacagt 480 tggccagagt accctcctgc ccacagggct gatgaataca tgcattgtga ttttaatact 540 cccttcttgc tttaatgaga ttctattagt tgcagtatta actatttttg gcctggcttc 600 tttaataaca aaagttatta ttttggcttt aatttgcata attgattttg tatccatgtt 660 tgccttgaag tttcttgctg tgtttgaata caaaactact tgtttagagg gtttcaacat 720 tcttttatag tttataaagg taagcatgct gttgataatg tattgttgaa tcataatgaa 780 caggtggcct ttgaggaaga caaggtcatt tgaacatcaa aaagtacttt gctttttgtt 840 atatatggtg ctgattacta actttctctt gagagtttat tttctttgtt cactaattat 900 taatgtaagc acgtagagca agggataatt taaataccaa atcctttgtg tctgcttttg 960 ctgctttgaa tgtataagtt ggatttatac atacatttat ctttatttac ag 1012 9 4315 DNA Homo sapiens 9 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc 
aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt 
cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 
4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactcctct tcatgtcagg caatttgaga tatacaccat 4200 gttgttttct actgctcaaa ctgtgctttg gcaattaaag cctgcctctk acatctctgt 4260 gtgtggcagc aaaatggagt ctcagcaacc agcaggaaaa tarcaggacc gagct 4315 10 6521 DNA Homo sapiens 10 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 
ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct 
tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttacttttaa aatggtgcat ttgtgcttct gaactatttt 4800 gaagagtcac ttctgtttac 
ctcaagtatc aattcatcct ccatacattt gaattcaagt 4860 tgttttttgt caaatttaca gttgtcaatt gatcttcaag ctgcagggtg cctagaaatg 4920 ggccgttgtc tgtagccctg gcatgtgcac acggacattt gccaccactg caagcaaaag 4980 tctggagaag ttcaccaacg acaagaacga ttagggaaaa tatgctgctg tgggttaaca 5040 actcagaaag tccctgatcc acatttggct gtttactaaa gcttgtgatt aactttttgg 5100 cagtgtgtac tatgctctat tgctatatat gctatctata aatgtagatg ttaaggataa 5160 gtaattctaa atttattatt ctatagtttt gaagtttggt taagtttcct ttcactcaat 5220 tgatttattt tgttgttaat caaatttatg ttaattggat cctttaaatt ttttttggca 5280 ttttccaaca aaaatggctt tattcataag aaaggaaaaa aatcaatgga atttgatatc 5340 taaagaagtt agaaagggag caaaataaaa aacataaagg agatagatga attagtaagc 5400 aaatcagtag tcgagttttt caaactggca aaattaatta attgactttt agcccaaatt 5460 tacattgtta attaaatcaa gaaggaagaa gatctaagag ctcccattga taggcaagcc 5520 tagagagaac tagctaaatt tatcatgcta ggatattgaa acacagaaag tttacataca 5580 tttatgaagg gtcaatttag tttggacagt gaggtatttg tcttagtgga aaaaaggaga 5640 attagtctga tcaaatcgtg aagtaataca gtgaacttgc aggtgcacaa aataagaggg 5700 ccacatctat atggtgcagt ctggaattct gtttaagttt gtaggtacct cttggacttc 5760 tgaattgatc cagttgtcat ccaccacaga catctcacat cagatacaga cagttccaag 5820 attgacaaca gagaacaacc tgctggaaag acctgggcag aaatggagag ccctgcggga 5880 accatgctac attttcatct aaagagagaa tgcacatctg atgagactga aagttctttg 5940 ttgttttaga ttgtagaatg gtattgaatt ggtctgtgga aaattgcatt gcttttattt 6000 ctttgtgtaa tcaagtttaa gtaatagggg atatataatc ataagcattt tagggtggga 6060 gggactatta agtaatttta agtgggtggg gttatttaga atgttagaat aatattatgt 6120 attagatatc gctataagtg gacatgcgta cttacttgta accctttacc ctataattgc 6180 tatccttaaa gatttcaaat aaactcggag ggaactgcag ggagaccaac ttatttagag 6240 cgaattggac atggataaaa accccagtgg gagaaagttc aaaggtgatt agattaataa 6300 tttaatagag gatgagtgac ctctgataaa ttactgctag aatgaacttg tcaatgatgg 6360 atggtaaatt ttcatggaag ttataaaagt gataaatatg cccctacccc tccttctaac 6420 tttattgctg tattctcttc actctatatt tctctctatt tgctaatatt gcattgctgt 6480 tacaataaaa attcaataaa gatttagtgg 
ttaagtgcaa a 6521 11 6160 DNA Homo sapiens 11 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa 
ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca 
gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttc tcaagtatca attcatcctc catacatttg aattcaagtt 4500 gttttttgtc aaatttacag ttgtcaattg atcttcaagc tgcagggtgc ctagaaatgg 4560 gccgttgtct gtagccctgg catgtgcaca cggacatttg ccaccactgc aagcaaaagt 4620 ctggagaagt tcaccaacga caagaacgat tagggaaaat atgctgctgt gggttaacaa 4680 ctcagaaagt ccctgatcca catttggctg tttactaaag cttgtgatta actttttggc 4740 agtgtgtact atgctctatt gctatatatg ctatctataa atgtagatgt taaggataag 4800 taattctaaa tttattattc tatagttttg aagtttggtt aagtttcctt tcactcaatt 4860 gatttatttt gttgttaatc aaatttatgt taattggatc ctttaaattt tttttggcat 4920 tttccaacaa aaatggcttt attcataaga aaggaaaaaa atcaatggaa tttgatatct 4980 aaagaagtta gaaagggagc aaaataaaaa acataaagga gatagatgaa ttagtaagca 
5040 aatcagtagt cgagtttttc aaactggcaa aattaattaa ttgactttta gcccaaattt 5100 acattgttaa ttaaatcaag aaggaagaag atctaagagc tcccattgat aggcaagcct 5160 agagagaact agctaaattt atcatgctag gatattgaaa cacagaaagt ttacatacat 5220 ttatgaaggg tcaatttagt ttggacagtg aggtatttgt cttagtggaa aaaaggagaa 5280 ttagtctgat caaatcgtga agtaatacag tgaacttgca ggtgcacaag tacctcttgg 5340 acttctgaat tgatccagtt gtcatccacc acagacatct cacatcagat acagacagtt 5400 ccaagattga caacagagaa caacctgctg gaaagacctg ggcagaaatg gagagccctg 5460 cgggaaccat gctacatttt catctaaaga gagaatgcac atctgatgag actgaaagtt 5520 ctttgttgtt ttagattgta gaatggtatt gaattggtct gtggaaaatt gcattgcttt 5580 tatttctttg tgtaatcaag tttaagtaat aggggatata taatcataag cattttaggg 5640 tgggagggac tattaagtaa ttttaagtgg gtggggttat ttagaatgtt agaataatat 5700 tatgtattag atatcgctat aagtggacat gcgtacttac ttgtaaccct ttaccctata 5760 attgctatcc ttaaagattt caaataaact cggagggaac tgcagggaga ccaacttatt 5820 tagagcgaat tggacatgga taaaaacccc agtgggagaa agttcaaagg tgattagatt 5880 aataatttaa tagaggatga gtgacctctg ataaattact gctagaatga acttgtcaat 5940 gatggatggt aaattttcat ggaagttata aaagtgataa ataaaaaccc ttgcttttac 6000 ccctgtcagt agccctcctc ctaccactga accccattgc ccctacccct ccttctaact 6060 ttattgctgt attctcttca ctctatattt ctctctattt gctaatattg cattgctgtt 6120 acaataaaaa ttcaataaag atttagtggt taagtgcaaa 6160 12 2629 DNA Homo sapiens 12 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca 
ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc 
ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaag 2580 agaagctgat acaaacgcag gaaatgctga tttctttatg gagggggag 2629 13 6521 DNA Homo sapiens 13 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga 
aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta 
tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga 
aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttacttttaa aatggtgcat ttgtgcttct gaactatttt 4800 gaagagtcac ttctgtttac ctcaagtatc aattcatcct ccatacattt gaattcaagt 4860 tgttttttgt caaatttaca gttgtcaatt gatcttcaag ctgcagggtg cctagaaatg 4920 ggccgttgtc tgtagccctg gcatgtgcac acggacattt gccaccactg caagcaaaag 4980 tctggagaag ttcaccaacg acaagaacga ttagggaaaa tatgctgctg tgggttaaca 5040 actcagaaag tccctgatcc acatttggct gtttactaaa gcttgtgatt aactttttgg 5100 cagtgtgtac tatgctctat tgctatatat gctatctata aatgtagatg ttaaggataa 5160 gtaattctaa atttattatt ctatagtttt gaagtttggt taagtttcct ttcactcaat 5220 tgatttattt tgttgttaat caaatttatg ttaattggat cctttaaatt ttttttggca 5280 ttttccaaca aaaatggctt tattcataag aaaggaaaaa aatcaatgga atttgatatc 5340 taaagaagtt agaaagggag caaaataaaa aacataaagg agatagatga attagtaagc 5400 aaatcagtag tcgagttttt caaactggca aaattaatta attgactttt agcccaaatt 5460 tacattgtta attaaatcaa gaaggaagaa gatctaagag ctcccattga taggcaagcc 5520 tagagagaac tagctaaatt tatcatgcta ggatattgaa acacagaaag tttacataca 5580 tttatgaagg gtcaatttag tttggacagt gaggtatttg tcttagtgga aaaaaggaga 5640 attagtctga tcaaatcgtg aagtaataca gtgaacttgc aggtgcacaa gtacctcttg 5700 gacttctgaa ttgatccagt tgtcatccac cacagacatc tcacatcaga tacagacagt 5760 tccaagattg acaacagaga acaacctgct ggaaagacct gggcagaaat ggagagccct 5820 gcgggaacca tgctacattt tcatctaaag agagaatgca catctgatga gactgaaagt 5880 tctttgttgt tttagattgt agaatggtat tgaattggtc tgtggaaaat tgcattgctt 5940 ttatttcttt gtgtaatcaa gtttaagtaa taggggatat ataatcataa gcattttagg 6000 gtgggaggga ctattaagta attttaagtg ggtggggtta tttagaatgt tagaataata 6060 ttatgtatta gatatcgcta taagtggaca tgcgtactta cttgtaaccc tttaccctat 6120 aattgctatc cttaaagatt tcaaataaac tcggagggaa ctgcagggag accaacttat 6180 ttagagcgaa ttggacatgg ataaaaaccc cagtgggaga aagttcaaag gtgattagat 6240 taataattta atagaggatg agtgacctct gataaattac tgctagaatg aacttgtcaa 6300 tgatggatgg taaattttca tggaagttat aaaagtgata aataaaaacc cttgctttta 
6360 cccctgtcag tagccctcct cctaccactg aaccccattg cccctacccc tccttctaac 6420 tttattgctg tattctcttc actctatatt tctctctatt tgctaatatt gcattgctgt 6480 tacaataaaa attcaataaa gatttagtgg ttaagtgcaa a 6521 14 6106 DNA Homo sapiens 14 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac 
agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa 
atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttc tcaagtatca attcatcctc catacatttg aattcaagtt 4500 gttttttgtc aaatttacag ttgtcaattg atcttcaagc tgcagggtgc ctagaaatgg 4560 gccgttgtct gtagccctgg catgtgcaca cggacatttg ccaccactgc aagcaaaagt 4620 ctggagaagt tcaccaacga caagaacgat tagggaaaat atgctgctgt gggttaacaa 4680 ctcagaaagt ccctgatcca catttggctg tttactaaag cttgtgatta actttttggc 4740 agtgtgtact atgctctatt gctatatatg ctatctataa atgtagatgt taaggataag 4800 taattctaaa tttattattc tatagttttg aagtttggtt aagtttcctt tcactcaatt 4860 gatttatttt gttgttaatc aaatttatgt 
taattggatc ctttaaattt tttttggcat 4920 tttccaacaa aaatggcttt attcataaga aaggaaaaaa atcaatggaa tttgatatct 4980 aaagaagtta gaaagggagc aaaataaaaa acataaagga gatagatgaa ttagtaagca 5040 aatcagtagt cgagtttttc aaactggcaa aattaattaa ttgactttta gcccaaattt 5100 acattgttaa ttaaatcaag aaggaagaag atctaagagc tcccattgat aggcaagcct 5160 agagagaact agctaaattt atcatgctag gatattgaaa cacagaaagt ttacatacat 5220 ttatgaaggg tcaatttagt ttggacagtg aggtatttgt cttagtggaa aaaaggagaa 5280 ttagtctgat caaatcgtga agtaatacag tgaacttgca ggtgcacaag tacctcttgg 5340 acttctgaat tgatccagtt gtcatccacc acagacatct cacatcagat acagacagtt 5400 ccaagattga caacagagaa caacctgctg gaaagacctg ggcagaaatg gagagccctg 5460 cgggaaccat gctacatttt catctaaaga gagaatgcac atctgatgag actgaaagtt 5520 ctttgttgtt ttagattgta gaatggtatt gaattggtct gtggaaaatt gcattgcttt 5580 tatttctttg tgtaatcaag tttaagtaat aggggatata taatcataag cattttaggg 5640 tgggagggac tattaagtaa ttttaagtgg gtggggttat ttagaatgtt agaataatat 5700 tatgtattag atatcgctat aagtggacat gcgtacttac ttgtaaccct ttaccctata 5760 attgctatcc ttaaagattt caaataaact cggagggaac tgcagggaga ccaacttatt 5820 tagagcgaat tggacatgga taaaaacccc agtgggagaa agttcaaagg tgattagatt 5880 aataatttaa tagaggatga gtgacctctg ataaattact gctagaatga acttgtcaat 5940 gatggatggt aaattttcat ggaagttata aaagtgataa atatgcccct acccctcctt 6000 ctaactttat tgctgtattc tcttcactct atatttctct ctatttgcta atattgcatt 6060 gctgttacaa taaaaattca ataaagattt agtggttaag tgcaaa 6106 15 6267 DNA Homo sapiens 15 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac 
aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat 
ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg 
tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttt aaaatggtgc atttgtgctt ctgaactatt ttgaagagtc 4500 acttctgttt acctcaagta tcaattcatc ctccatacat ttgaattcaa gttgtttttt 4560 gtcaaattta cagttgtcaa ttgatcttca agctgcaggg tgcctagaaa tgggccgttg 4620 tctgtagccc tggcatgtgc acacggacat ttgccaccac tgcaagcaaa agtctggaga 4680 agttcaccaa cgacaagaac gattagggaa aatatgctgc tgtgggttaa caactcagaa 4740 agtccctgat ccacatttgg ctgtttacta aagcttgtga ttaacttttt ggcagtgtgt 4800 actatgctct attgctatat atgctatcta taaatgtaga tgttaaggat aagtaattct 4860 aaatttatta ttctatagtt ttgaagtttg gttaagtttc ctttcactca attgatttat 4920 tttgttgtta atcaaattta tgttaattgg atcctttaaa ttttttttgg cattttccaa 4980 caaaaatggc tttattcata agaaaggaaa aaaatcaatg gaatttgata tctaaagaag 5040 ttagaaaggg agcaaaataa aaaacataaa ggagatagat gaattagtaa gcaaatcagt 5100 agtcgagttt ttcaaactgg caaaattaat taattgactt ttagcccaaa tttacattgt 5160 taattaaatc aagaaggaag aagatctaag agctcccatt gataggcaag cctagagaga 5220 actagctaaa tttatcatgc taggatattg aaacacagaa agtttacata catttatgaa 5280 gggtcaattt agtttggaca gtgaggtatt tgtcttagtg gaaaaaagga gaattagtct 5340 gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac aaaataagag ggccacatct 5400 atatggtgca gtctggaatt ctgtttaagt ttgtaggtac ctcttggact tctgaattga 5460 tccagttgtc atccaccaca gacatctcac atcagataca gacagttcca agattgacaa 
5520 cagagaacaa cctgctggaa agacctgggc agaaatggag agccctgcgg gaaccatgct 5580 acattttcat ctaaagagag aatgcacatc tgatgagact gaaagttctt tgttgtttta 5640 gattgtagaa tggtattgaa ttggtctgtg gaaaattgca ttgcttttat ttctttgtgt 5700 aatcaagttt aagtaatagg ggatatataa tcataagcat tttagggtgg gagggactat 5760 taagtaattt taagtgggtg gggttattta gaatgttaga ataatattat gtattagata 5820 tcgctataag tggacatgcg tacttacttg taacccttta ccctataatt gctatcctta 5880 aagatttcaa ataaactcgg agggaactgc agggagacca acttatttag agcgaattgg 5940 acatggataa aaaccccagt gggagaaagt tcaaaggtga ttagattaat aatttaatag 6000 aggatgagtg acctctgata aattactgct agaatgaact tgtcaatgat ggatggtaaa 6060 ttttcatgga agttataaaa gtgataaata aaaacccttg cttttacccc tgtcagtagc 6120 cctcctccta ccactgaacc ccattgcccc tacccctcct tctaacttta ttgctgtatt 6180 ctcttcactc tatatttctc tctatttgct aatattgcat tgctgttaca ataaaaattc 6240 aataaagatt tagtggttaa gtgcaaa 6267 16 6467 DNA Homo sapiens 16 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc 
agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt 
tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt 
ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttacttttaa aatggtgcat ttgtgcttct gaactatttt 4800 gaagagtcac ttctgtttac ctcaagtatc aattcatcct ccatacattt gaattcaagt 4860 tgttttttgt caaatttaca gttgtcaatt gatcttcaag ctgcagggtg cctagaaatg 4920 ggccgttgtc tgtagccctg gcatgtgcac acggacattt gccaccactg caagcaaaag 4980 tctggagaag ttcaccaacg acaagaacga ttagggaaaa tatgctgctg tgggttaaca 5040 actcagaaag tccctgatcc acatttggct gtttactaaa gcttgtgatt aactttttgg 5100 cagtgtgtac tatgctctat tgctatatat gctatctata aatgtagatg ttaaggataa 5160 gtaattctaa atttattatt ctatagtttt gaagtttggt taagtttcct ttcactcaat 5220 tgatttattt tgttgttaat caaatttatg ttaattggat cctttaaatt ttttttggca 5280 ttttccaaca aaaatggctt tattcataag aaaggaaaaa aatcaatgga atttgatatc 5340 taaagaagtt agaaagggag caaaataaaa aacataaagg agatagatga attagtaagc 5400 aaatcagtag tcgagttttt caaactggca aaattaatta attgactttt agcccaaatt 5460 tacattgtta attaaatcaa gaaggaagaa gatctaagag ctcccattga taggcaagcc 5520 tagagagaac tagctaaatt tatcatgcta ggatattgaa acacagaaag tttacataca 5580 tttatgaagg gtcaatttag tttggacagt gaggtatttg tcttagtgga aaaaaggaga 5640 attagtctga tcaaatcgtg aagtaataca gtgaacttgc aggtgcacaa gtacctcttg 5700 gacttctgaa ttgatccagt tgtcatccac cacagacatc tcacatcaga tacagacagt 5760 tccaagattg acaacagaga acaacctgct ggaaagacct gggcagaaat ggagagccct 5820 gcgggaacca tgctacattt tcatctaaag agagaatgca catctgatga gactgaaagt 5880 tctttgttgt tttagattgt agaatggtat tgaattggtc tgtggaaaat tgcattgctt 5940 ttatttcttt gtgtaatcaa gtttaagtaa taggggatat ataatcataa 
gcattttagg 6000 gtgggaggga ctattaagta attttaagtg ggtggggtta tttagaatgt tagaataata 6060 ttatgtatta gatatcgcta taagtggaca tgcgtactta cttgtaaccc tttaccctat 6120 aattgctatc cttaaagatt tcaaataaac tcggagggaa ctgcagggag accaacttat 6180 ttagagcgaa ttggacatgg ataaaaaccc cagtgggaga aagttcaaag gtgattagat 6240 taataattta atagaggatg agtgacctct gataaattac tgctagaatg aacttgtcaa 6300 tgatggatgg taaattttca tggaagttat aaaagtgata aatatgcccc tacccctcct 6360 tctaacttta ttgctgtatt ctcttcactc tatatttctc tctatttgct aatattgcat 6420 tgctgttaca ataaaaattc aataaagatt tagtggttaa gtgcaaa 6467 17 5112 DNA Homo sapiens 17 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 
1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 
gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttc tcaagtatca attcatcctc catacatttg aattcaagtt 4500 gttttttgtc aaatttacag ttgtcaattg atcttcaagc tgcagggtgc ctagaaatgg 4560 gccgttgtct 
gtagccctgg catgtgcaca cggacatttg ccaccactgc aagcaaaagt 4620 ctggagaagt tcaccaacga caagaacgat tagggaaaat atgctgctgt gggttaacaa 4680 ctcagaaagt ccctgatcca catttggctg tttactaaag cttgtgatta actttttggc 4740 agtgtgtact atgctctatt gctatatatg ctatctataa atgtagatgt taaggataag 4800 taattctaaa tttattattc tatagttttg aagtttggtt aagtttcctt tcactcaatt 4860 gatttatttt gttgttaatc aaatttatgt taattggatc ctttaaattt tttttggcat 4920 tttccaacaa aaatggcttt attcataaga aaggaaaaaa atcaatggaa tttgatatct 4980 aaagaagtta gaaagggagc aaaataaaaa acataaagga gatagatgaa ttagtaagca 5040 aatcagtagt cgagtttttc aaactggcaa aattaattaa ttgactttta gcccaaattt 5100 acattgttaa tt 5112 18 6213 DNA Homo sapiens 18 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata 
tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 
2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttt aaaatggtgc atttgtgctt ctgaactatt ttgaagagtc 4500 
acttctgttt acctcaagta tcaattcatc ctccatacat ttgaattcaa gttgtttttt 4560 gtcaaattta cagttgtcaa ttgatcttca agctgcaggg tgcctagaaa tgggccgttg 4620 tctgtagccc tggcatgtgc acacggacat ttgccaccac tgcaagcaaa agtctggaga 4680 agttcaccaa cgacaagaac gattagggaa aatatgctgc tgtgggttaa caactcagaa 4740 agtccctgat ccacatttgg ctgtttacta aagcttgtga ttaacttttt ggcagtgtgt 4800 actatgctct attgctatat atgctatcta taaatgtaga tgttaaggat aagtaattct 4860 aaatttatta ttctatagtt ttgaagtttg gttaagtttc ctttcactca attgatttat 4920 tttgttgtta atcaaattta tgttaattgg atcctttaaa ttttttttgg cattttccaa 4980 caaaaatggc tttattcata agaaaggaaa aaaatcaatg gaatttgata tctaaagaag 5040 ttagaaaggg agcaaaataa aaaacataaa ggagatagat gaattagtaa gcaaatcagt 5100 agtcgagttt ttcaaactgg caaaattaat taattgactt ttagcccaaa tttacattgt 5160 taattaaatc aagaaggaag aagatctaag agctcccatt gataggcaag cctagagaga 5220 actagctaaa tttatcatgc taggatattg aaacacagaa agtttacata catttatgaa 5280 gggtcaattt agtttggaca gtgaggtatt tgtcttagtg gaaaaaagga gaattagtct 5340 gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac aaaataagag ggccacatct 5400 atatggtgca gtctggaatt ctgtttaagt ttgtaggtac ctcttggact tctgaattga 5460 tccagttgtc atccaccaca gacatctcac atcagataca gacagttcca agattgacaa 5520 cagagaacaa cctgctggaa agacctgggc agaaatggag agccctgcgg gaaccatgct 5580 acattttcat ctaaagagag aatgcacatc tgatgagact gaaagttctt tgttgtttta 5640 gattgtagaa tggtattgaa ttggtctgtg gaaaattgca ttgcttttat ttctttgtgt 5700 aatcaagttt aagtaatagg ggatatataa tcataagcat tttagggtgg gagggactat 5760 taagtaattt taagtgggtg gggttattta gaatgttaga ataatattat gtattagata 5820 tcgctataag tggacatgcg tacttacttg taacccttta ccctataatt gctatcctta 5880 aagatttcaa ataaactcgg agggaactgc agggagacca acttatttag agcgaattgg 5940 acatggataa aaaccccagt gggagaaagt tcaaaggtga ttagattaat aatttaatag 6000 aggatgagtg acctctgata aattactgct agaatgaact tgtcaatgat ggatggtaaa 6060 ttttcatgga agttataaaa gtgataaata tgcccctacc cctccttcta actttattgc 6120 tgtattctct tcactctata tttctctcta tttgctaata ttgcattgct gttacaataa 6180 aaattcaata 
aagatttagt ggttaagtgc aaa 6213 19 6522 DNA Homo sapiens 19 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc 
cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg 
ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttactttctc aagtatcaat tcatcctcca tacatttgaa 4800 ttcaagttgt tttttgtcaa atttacagtt gtcaattgat cttcaagctg cagggtgcct 4860 agaaatgggc cgttgtctgt agccctggca tgtgcacacg gacatttgcc accactgcaa 4920 gcaaaagtct ggagaagttc accaacgaca agaacgatta gggaaaatat gctgctgtgg 4980 gttaacaact cagaaagtcc ctgatccaca tttggctgtt tactaaagct 
tgtgattaac 5040 tttttggcag tgtgtactat gctctattgc tatatatgct atctataaat gtagatgtta 5100 aggataagta attctaaatt tattattcta tagttttgaa gtttggttaa gtttcctttc 5160 actcaattga tttattttgt tgttaatcaa atttatgtta attggatcct ttaaattttt 5220 tttggcattt tccaacaaaa atggctttat tcataagaaa ggaaaaaaat caatggaatt 5280 tgatatctaa agaagttaga aagggagcaa aataaaaaac ataaaggaga tagatgaatt 5340 agtaagcaaa tcagtagtcg agtttttcaa actggcaaaa ttaattaatt gacttttagc 5400 ccaaatttac attgttaatt aaatcaagaa ggaagaagat ctaagagctc ccattgatag 5460 gcaagcctag agagaactag ctaaatttat catgctagga tattgaaaca cagaaagttt 5520 acatacattt atgaagggtc aatttagttt ggacagtgag gtatttgtct tagtggaaaa 5580 aaggagaatt agtctgatca aatcgtgaag taatacagtg aacttgcagg tgcacaaaat 5640 aagagggcca catctatatg gtgcagtctg gaattctgtt taagtttgta ggtacctctt 5700 ggacttctga attgatccag ttgtcatcca ccacagacat ctcacatcag atacagacag 5760 ttccaagatt gacaacagag aacaacctgc tggaaagacc tgggcagaaa tggagagccc 5820 tgcgggaacc atgctacatt ttcatctaaa gagagaatgc acatctgatg agactgaaag 5880 ttctttgttg ttttagattg tagaatggta ttgaattggt ctgtggaaaa ttgcattgct 5940 tttatttctt tgtgtaatca agtttaagta ataggggata tataatcata agcattttag 6000 ggtgggaggg actattaagt aattttaagt gggtggggtt atttagaatg ttagaataat 6060 attatgtatt agatatcgct ataagtggac atgcgtactt acttgtaacc ctttacccta 6120 taattgctat ccttaaagat ttcaaataaa ctcggaggga actgcaggga gaccaactta 6180 tttagagcga attggacatg gataaaaacc ccagtgggag aaagttcaaa ggtgattaga 6240 ttaataattt aatagaggat gagtgacctc tgataaatta ctgctagaat gaacttgtca 6300 atgatggatg gtaaattttc atggaagtta taaaagtgat aaataaaaac ccttgctttt 6360 acccctgtca gtagccctcc tcctaccact gaaccccatt gcccctaccc ctccttctaa 6420 ctttattgct gtattctctt cactctatat ttctctctat ttgctaatat tgcattgctg 6480 ttacaataaa aattcaataa agatttagtg gttaagtgca aa 6522 20 4007 DNA Homo sapiens 20 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc 
ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 
ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca 
cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcct cttcatgtca ggcaatttga gatatacacc atgttgtttt 3900 ctactgctca aactgtgctt tggcaattaa agcctgcctc tkacatctct gtgtgtggca 3960 gcaaaatgga gtctcagcaa ccagcaggaa aatarcagga ccgagct 4007 21 6213 DNA Homo sapiens 21 atgaagatgt tactcctgct gcattgcctt ggggtgtttc tgtcctgttc tggacacatc 60 caggatgagc acccccaata tcacagccct ccggatgtgg tgattcctgt gaggataact 120 ggcaccacca gaggcatgac acctccaggc tggctctcct atatcctgcc ctttggaggc 180 cagaaacaca ttatccacat aaaggtcaag aagcttttgt tttccaaaca cctccctgtg 240 ttcacctaca cagaccaggg tgctatcctt gaggaccagc catttgtcca gaataactgc 300 tactatcatg gttatgtgga aggggaccca gaatccctgg tttccctcag tacctgtttt 360 gggggttttc aaggaatatt acagataaat gactttgctt atgaaatcaa gcccctagca 420 ttttctacca cgtttgaaca tctggtatac aagatggaca gtgaggagaa acaattttca 480 accatgagat ccggatttat gcaaaatgaa ataacatgcc gaatggaatt tgaagaaatt 540 gataattcca ctcagaagca aagttcttat gtgggctggt ggatccattt taggattgtt 600 gaaattgtag tcgtcattga taattatctg tacattcgtt atgaaaggaa cgactcaaag 660 ttgctggagg atctatatgt tattgttaat atagtggatt ccattttgga tgtcattggt 720 gttaaggtgt tattatttgg tttggagatc tggaccaata aaaacctcat tgtagtagat 780 gatgtaagga aatctgtgca cctgtattgc aagtggaagt cggagaacat tacgccccgg 840 atgcaacatg acacctcaca tcttttcaca actctaggat taagagggtt aagtggcata 900 ggagctttta gaggaatgtg tacaccacac cgtagttgtg caattgttac tttcatgaac 960 aaaactttgg gcactttttc aattgcagtg gctcatcatc taggtcataa tttgggcatg 1020 aaccatgatg aggatacatg tcgttgttca caacctagat gcataatgca tgaaggcaac 1080 ccaccaataa ctaaatttag caattgtagt tatggtgatt tttgggaata tactgtagag 1140 aggacaaagt gtttgcttga aacagtacac acaaaggaca tctttaatgt gaagcgctgt 1200 gggaatggtg ttgttgaaga 
aggagaagag tgtgactgtg gacctttaaa gcattgtgca 1260 aaagatccct gctgtctgtc aaattgcact ctgactgatg gttctacttg tgcttttggg 1320 ctttgttgca aagactgcaa gttcctacca tcagggaaag tgtgtagaaa ggaggtcaat 1380 gaatgtgatc ttccagagtg gtgcaatggt acttcccata agtgcccaga tgacttttat 1440 gtggaagatg gaattccctg taaggagagg ggctactgct atgaaaagag ctgtcatgac 1500 cgcaatgaac agtgtaggag gatttttggt gcaggcgcaa atactgcaag tgagacttgc 1560 tacaaagaat tgaacacctt aggtgaccgt gttggtcact gtggtatcaa aaatgctaca 1620 tatataaagt gtaatatctc agatgtccag tgtggaagaa ttcagtgtga gaatgtgaca 1680 gaaattccca atatgagtga tcatactact gtgcattggg ctcgcttcaa tgacataatg 1740 tgctggagta ctgattacca tttggggatg aagggacctg atattggtga agtgaaagat 1800 ggaacagagt gtgggataga tcatatatgc atccacaggc actgtgtcca tataaccatc 1860 ttgaatagta attgctcacc tgcattttgt aacaagaggg gcatctgcaa caataaacat 1920 cactgccatt gcaattatct gtgggaccct cccaactgcc tgataaaagg ctatggaggt 1980 agtgttgaca gtggtccacc ccctaagaga aagaagaaaa agaagttctg ttatctgtgt 2040 atattgttgc ttattgtttt gtttatttta ttatgttgtc tttatcgact ttgtaaaaaa 2100 agtaaaccaa taaaaaagca gcaagatgtt caaactccat ctgcaaaaga agaggaaaaa 2160 attcagcgtc gacctcatga gttacctccc cagagtcaac cttgggtgat gccttcccag 2220 agtcaacctc ctgtgacgcc ttcccagagt catcctcagg tgatgccttc ccagagtcaa 2280 cctcctcaaa atttattcct gttcagcttc tcaatcagtg actgtgtgct aaattttagg 2340 ctactgtatc ttcaggccac ctgaggcaca tcctctctga aacggctatg gaaggttagg 2400 gccactctgg actggcacac atcctaaagc accaaaagac cttcaacatt ttctgagagc 2460 aacagagtat ttgccaataa atgatctctc atttttccac cttgactgcc aatctaacta 2520 aaataattaa taagtttact ttccagccag tcctggaagt ctgggtttta cctgccaaaa 2580 cctccatcac catctaaatt ataggctgcc aaatttgctg tttaacattt acagagaagc 2640 tgatacaaac gcaggaaatg ctgatttctt tatggagggg gagacgagga ggaggaggac 2700 atgacttttc ttgcggtttc ggtaccctct ttttaaatca ctggaggact gaggccttat 2760 taaggaagcc aaaattatcg gtgcagtgtg gaaaggcttc cgtgatcctc tcgctgcacc 2820 cttagaaact tcaccgtctt caaactccat ttccatggtt ctgttaattc tcaaggagca 2880 gcaactcgac tggttctccc aggagcagga 
aaaacccttg tgacatgaaa catctcaggc 2940 ctgaaaagaa agtgctctct cagatggact cttgcatgtt aagactatgt cttcacatca 3000 tggtgcaaat cacatgtacc caatgactcc ggctttgaca caacacctta ccatcatcat 3060 gccatgatgg cttccacaaa gcattaaacc tggtaaccag agattactgg tggctccagc 3120 gttgttagat gttcatgaaa tgtgaccacc tctcaatcac ctttgagggc taaagagtag 3180 cacatcaaaa ggactccaaa atcccatacc caactcttaa gagatttgtc ctggtacttc 3240 agaaagaatt ttcatgagtg ttcttaattg gctggaaaag caccagctga cgttttggaa 3300 gaatctatcc atgtgtctgc ctccatatgc atctgggcat ttcatcttca gtcccctcat 3360 tagactgtag cattaggatg tgtggagaga ggagaaatga tttagcaccc agattcacac 3420 tcctatgcct ggaaggggga catctttgaa gaagaggaat tagggctgtg gacactgtct 3480 tgaggatgtg gacttcctta gtgagctcca cattacttga tggtaaccac ttcaaaagga 3540 tcagaatcca cgtaatgaaa aaggtccctc tagaggatgg agctgatgtg aagctgccaa 3600 tggatgaaaa gcctcagaaa gcaactcaaa ggactcaaag caacggacaa cacaagagtt 3660 gtcttcagcc cagtgacacc tctgatgtcc cctggaagct ttgtgctaac ctgggactgc 3720 ctgacttcct ttagcctggt cccttgctac taccttgaac tgttttatct aacctctctt 3780 tttctgttta attctttgct actgccattg accctgctgc aggatttgtg tcattttcct 3840 gcctggttgc tgagactcca ttttgctgcc acacacagag atgtaagagg caggctttaa 3900 ttgccaaagc acagtttgag cagtagaaaa caacatggtg tatatctcaa attgcctgac 3960 atgaagagga gtctaacggt gaagtttcac ttttcatcag catcatcttt cacatgttca 4020 ttatcatccg ctcttattct tgcatgttta aacacttaaa atttttagta taatttttag 4080 tgtgttttga agtggtgact aggctttcaa aaacttccat tgaattacaa agcactatcc 4140 agttcttatt gttaaactaa gtaaaaatga taagtaacat agtgtaaaat attcctttac 4200 tgtgaacttc ttacaatgct gtgaatgaga ggctcctcag aactggagca tttgtataat 4260 aattcatcct gttcatcttc aattttaaca tcatatataa tttcaattct atcaattggg 4320 cctttaaaaa tcatataaaa ggatataaaa tttgaaaaga gaaacctaat tggctattta 4380 atccaaaaca actttttttt ttccttcaat ggaatcagaa agcttgtcaa tcactcatgt 4440 gttttagagt aattactttt aaaatggtgc atttgtgctt ctgaactatt ttgaagagtc 4500 acttctgttt acctcaagta tcaattcatc ctccatacat ttgaattcaa gttgtttttt 4560 gtcaaattta cagttgtcaa ttgatcttca agctgcaggg 
tgcctagaaa tgggccgttg 4620 tctgtagccc tggcatgtgc acacggacat ttgccaccac tgcaagcaaa agtctggaga 4680 agttcaccaa cgacaagaac gattagggaa aatatgctgc tgtgggttaa caactcagaa 4740 agtccctgat ccacatttgg ctgtttacta aagcttgtga ttaacttttt ggcagtgtgt 4800 actatgctct attgctatat atgctatcta taaatgtaga tgttaaggat aagtaattct 4860 aaatttatta ttctatagtt ttgaagtttg gttaagtttc ctttcactca attgatttat 4920 tttgttgtta atcaaattta tgttaattgg atcctttaaa ttttttttgg cattttccaa 4980 caaaaatggc tttattcata agaaaggaaa aaaatcaatg gaatttgata tctaaagaag 5040 ttagaaaggg agcaaaataa aaaacataaa ggagatagat gaattagtaa gcaaatcagt 5100 agtcgagttt ttcaaactgg caaaattaat taattgactt ttagcccaaa tttacattgt 5160 taattaaatc aagaaggaag aagatctaag agctcccatt gataggcaag cctagagaga 5220 actagctaaa tttatcatgc taggatattg aaacacagaa agtttacata catttatgaa 5280 gggtcaattt agtttggaca gtgaggtatt tgtcttagtg gaaaaaagga gaattagtct 5340 gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac aagtacctct tggacttctg 5400 aattgatcca gttgtcatcc accacagaca tctcacatca gatacagaca gttccaagat 5460 tgacaacaga gaacaacctg ctggaaagac ctgggcagaa atggagagcc ctgcgggaac 5520 catgctacat tttcatctaa agagagaatg cacatctgat gagactgaaa gttctttgtt 5580 gttttagatt gtagaatggt attgaattgg tctgtggaaa attgcattgc ttttatttct 5640 ttgtgtaatc aagtttaagt aataggggat atataatcat aagcatttta gggtgggagg 5700 gactattaag taattttaag tgggtggggt tatttagaat gttagaataa tattatgtat 5760 tagatatcgc tataagtgga catgcgtact tacttgtaac cctttaccct ataattgcta 5820 tccttaaaga tttcaaataa actcggaggg aactgcaggg agaccaactt atttagagcg 5880 aattggacat ggataaaaac cccagtggga gaaagttcaa aggtgattag attaataatt 5940 taatagagga tgagtgacct ctgataaatt actgctagaa tgaacttgtc aatgatggat 6000 ggtaaatttt catggaagtt ataaaagtga taaataaaaa cccttgcttt tacccctgtc 6060 agtagccctc ctcctaccac tgaaccccat tgcccctacc cctccttcta actttattgc 6120 tgtattctct tcactctata tttctctcta tttgctaata ttgcattgct gttacaataa 6180 aaattcaata aagatttagt ggttaagtgc aaa 6213 22 6468 DNA Homo sapiens 22 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg 
ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat 
tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt 
tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttactttctc aagtatcaat tcatcctcca tacatttgaa 4800 ttcaagttgt tttttgtcaa atttacagtt gtcaattgat cttcaagctg cagggtgcct 4860 agaaatgggc cgttgtctgt agccctggca tgtgcacacg gacatttgcc accactgcaa 4920 gcaaaagtct ggagaagttc accaacgaca agaacgatta gggaaaatat gctgctgtgg 4980 gttaacaact cagaaagtcc ctgatccaca tttggctgtt tactaaagct tgtgattaac 5040 tttttggcag tgtgtactat gctctattgc tatatatgct atctataaat gtagatgtta 5100 aggataagta attctaaatt tattattcta 
tagttttgaa gtttggttaa gtttcctttc 5160 actcaattga tttattttgt tgttaatcaa atttatgtta attggatcct ttaaattttt 5220 tttggcattt tccaacaaaa atggctttat tcataagaaa ggaaaaaaat caatggaatt 5280 tgatatctaa agaagttaga aagggagcaa aataaaaaac ataaaggaga tagatgaatt 5340 agtaagcaaa tcagtagtcg agtttttcaa actggcaaaa ttaattaatt gacttttagc 5400 ccaaatttac attgttaatt aaatcaagaa ggaagaagat ctaagagctc ccattgatag 5460 gcaagcctag agagaactag ctaaatttat catgctagga tattgaaaca cagaaagttt 5520 acatacattt atgaagggtc aatttagttt ggacagtgag gtatttgtct tagtggaaaa 5580 aaggagaatt agtctgatca aatcgtgaag taatacagtg aacttgcagg tgcacaaaat 5640 aagagggcca catctatatg gtgcagtctg gaattctgtt taagtttgta ggtacctctt 5700 ggacttctga attgatccag ttgtcatcca ccacagacat ctcacatcag atacagacag 5760 ttccaagatt gacaacagag aacaacctgc tggaaagacc tgggcagaaa tggagagccc 5820 tgcgggaacc atgctacatt ttcatctaaa gagagaatgc acatctgatg agactgaaag 5880 ttctttgttg ttttagattg tagaatggta ttgaattggt ctgtggaaaa ttgcattgct 5940 tttatttctt tgtgtaatca agtttaagta ataggggata tataatcata agcattttag 6000 ggtgggaggg actattaagt aattttaagt gggtggggtt atttagaatg ttagaataat 6060 attatgtatt agatatcgct ataagtggac atgcgtactt acttgtaacc ctttacccta 6120 taattgctat ccttaaagat ttcaaataaa ctcggaggga actgcaggga gaccaactta 6180 tttagagcga attggacatg gataaaaacc ccagtgggag aaagttcaaa ggtgattaga 6240 ttaataattt aatagaggat gagtgacctc tgataaatta ctgctagaat gaacttgtca 6300 atgatggatg gtaaattttc atggaagtta taaaagtgat aaatatgccc ctacccctcc 6360 ttctaacttt attgctgtat tctcttcact ctatatttct ctctatttgc taatattgca 6420 ttgctgttac aataaaaatt caataaagat ttagtggtta agtgcaaa 6468 23 2917 DNA Homo sapiens 23 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt 
caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt tgtcttagtg gaaaaaagga 1980 gaattagtct gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac 
aaaataagag 2040 ggccacatct atatggtgca gtctggaatt ctgtttaagt ttgtaggtac ctcttggact 2100 tctgaattga tccagttgtc atccaccaca gacatctcac atcagataca gacagttcca 2160 agattgacaa cagagaacaa cctgctggaa agacctgggc agaaatggag agccctgcgg 2220 gaaccatgct acattttcat ctaaagagag aatgcacatc tgatgagact gaaagttctt 2280 tgttgtttta gattgtagaa tggtattgaa ttggtctgtg gaaaattgca ttgcttttat 2340 ttctttgtgt aatcaagttt aagtaatagg ggatatataa tcataagcat tttagggtgg 2400 gagggactat taagtaattt taagtgggtg gggttattta gaatgttaga ataatattat 2460 gtattagata tcgctataag tggacatgcg tacttacttg taacccttta ccctataatt 2520 gctatcctta aagatttcaa ataaactcgg agggaactgc agggagacca acttatttag 2580 agcgaattgg acatggataa aaaccccagt gggagaaagt tcaaaggtga ttagattaat 2640 aatttaatag aggatgagtg acctctgata aattactgct agaatgaact tgtcaatgat 2700 ggatggtaaa ttttcatgga agttataaaa gtgataaata aaaacccttg cttttacccc 2760 tgtcagtagc cctcctccta ccactgaacc ccattgcccc tacccctcct tctaacttta 2820 ttgctgtatt ctcttcactc tatatttctc tctatttgct aatattgcat tgctgttaca 2880 ataaaaattc aataaagatt tagtggttaa gtgcaaa 2917 24 6468 DNA Homo sapiens 24 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 
tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt 
accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg 
agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttactttctc aagtatcaat tcatcctcca tacatttgaa 4800 ttcaagttgt tttttgtcaa atttacagtt gtcaattgat cttcaagctg cagggtgcct 4860 agaaatgggc cgttgtctgt agccctggca tgtgcacacg gacatttgcc accactgcaa 4920 gcaaaagtct ggagaagttc accaacgaca agaacgatta gggaaaatat gctgctgtgg 4980 gttaacaact cagaaagtcc ctgatccaca tttggctgtt tactaaagct tgtgattaac 5040 tttttggcag tgtgtactat gctctattgc tatatatgct atctataaat gtagatgtta 5100 aggataagta attctaaatt tattattcta tagttttgaa gtttggttaa gtttcctttc 5160 actcaattga tttattttgt tgttaatcaa atttatgtta attggatcct ttaaattttt 5220 tttggcattt tccaacaaaa atggctttat tcataagaaa ggaaaaaaat caatggaatt 5280 tgatatctaa agaagttaga aagggagcaa aataaaaaac ataaaggaga tagatgaatt 5340 agtaagcaaa tcagtagtcg agtttttcaa actggcaaaa ttaattaatt gacttttagc 5400 ccaaatttac attgttaatt aaatcaagaa ggaagaagat ctaagagctc ccattgatag 5460 gcaagcctag agagaactag ctaaatttat catgctagga tattgaaaca cagaaagttt 5520 acatacattt atgaagggtc aatttagttt ggacagtgag gtatttgtct tagtggaaaa 5580 aaggagaatt agtctgatca aatcgtgaag taatacagtg aacttgcagg tgcacaagta 5640 cctcttggac ttctgaattg atccagttgt catccaccac agacatctca catcagatac 5700 agacagttcc aagattgaca acagagaaca acctgctgga aagacctggg cagaaatgga 5760 gagccctgcg ggaaccatgc tacattttca tctaaagaga gaatgcacat ctgatgagac 5820 tgaaagttct ttgttgtttt agattgtaga 
atggtattga attggtctgt ggaaaattgc 5880 attgctttta tttctttgtg taatcaagtt taagtaatag gggatatata atcataagca 5940 ttttagggtg ggagggacta ttaagtaatt ttaagtgggt ggggttattt agaatgttag 6000 aataatatta tgtattagat atcgctataa gtggacatgc gtacttactt gtaacccttt 6060 accctataat tgctatcctt aaagatttca aataaactcg gagggaactg cagggagacc 6120 aacttattta gagcgaattg gacatggata aaaaccccag tgggagaaag ttcaaaggtg 6180 attagattaa taatttaata gaggatgagt gacctctgat aaattactgc tagaatgaac 6240 ttgtcaatga tggatggtaa attttcatgg aagttataaa agtgataaat aaaaaccctt 6300 gcttttaccc ctgtcagtag ccctcctcct accactgaac cccattgccc ctacccctcc 6360 ttctaacttt attgctgtat tctcttcact ctatatttct ctctatttgc taatattgca 6420 ttgctgttac aataaaaatt caataaagat ttagtggtta agtgcaaa 6468 25 2863 DNA Homo sapiens 25 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag 
ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt tgtcttagtg gaaaaaagga 1980 gaattagtct gatcaaatcg tgaagtaata cagtgaactt gcaggtgcac aaaataagag 2040 ggccacatct atatggtgca gtctggaatt ctgtttaagt ttgtaggtac ctcttggact 2100 tctgaattga tccagttgtc atccaccaca gacatctcac atcagataca gacagttcca 2160 agattgacaa cagagaacaa cctgctggaa agacctgggc agaaatggag agccctgcgg 2220 gaaccatgct acattttcat ctaaagagag aatgcacatc tgatgagact gaaagttctt 2280 tgttgtttta gattgtagaa tggtattgaa ttggtctgtg gaaaattgca ttgcttttat 2340 ttctttgtgt aatcaagttt aagtaatagg ggatatataa tcataagcat tttagggtgg 2400 gagggactat taagtaattt taagtgggtg gggttattta gaatgttaga ataatattat 2460 gtattagata tcgctataag tggacatgcg tacttacttg taacccttta ccctataatt 2520 gctatcctta aagatttcaa ataaactcgg agggaactgc agggagacca acttatttag 2580 agcgaattgg acatggataa aaaccccagt gggagaaagt tcaaaggtga ttagattaat 2640 aatttaatag aggatgagtg acctctgata aattactgct agaatgaact tgtcaatgat 2700 ggatggtaaa ttttcatgga agttataaaa gtgataaata tgcccctacc 
cctccttcta 2760 actttattgc tgtattctct tcactctata tttctctcta tttgctaata ttgcattgct 2820 gttacaataa aaattcaata aagatttagt ggttaagtgc aaa 2863 26 6414 DNA Homo sapiens 26 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 
aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct 
gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttactttctc aagtatcaat tcatcctcca tacatttgaa 4800 ttcaagttgt tttttgtcaa atttacagtt gtcaattgat cttcaagctg cagggtgcct 4860 agaaatgggc cgttgtctgt agccctggca tgtgcacacg gacatttgcc accactgcaa 4920 gcaaaagtct ggagaagttc 
accaacgaca agaacgatta gggaaaatat gctgctgtgg 4980 gttaacaact cagaaagtcc ctgatccaca tttggctgtt tactaaagct tgtgattaac 5040 tttttggcag tgtgtactat gctctattgc tatatatgct atctataaat gtagatgtta 5100 aggataagta attctaaatt tattattcta tagttttgaa gtttggttaa gtttcctttc 5160 actcaattga tttattttgt tgttaatcaa atttatgtta attggatcct ttaaattttt 5220 tttggcattt tccaacaaaa atggctttat tcataagaaa ggaaaaaaat caatggaatt 5280 tgatatctaa agaagttaga aagggagcaa aataaaaaac ataaaggaga tagatgaatt 5340 agtaagcaaa tcagtagtcg agtttttcaa actggcaaaa ttaattaatt gacttttagc 5400 ccaaatttac attgttaatt aaatcaagaa ggaagaagat ctaagagctc ccattgatag 5460 gcaagcctag agagaactag ctaaatttat catgctagga tattgaaaca cagaaagttt 5520 acatacattt atgaagggtc aatttagttt ggacagtgag gtatttgtct tagtggaaaa 5580 aaggagaatt agtctgatca aatcgtgaag taatacagtg aacttgcagg tgcacaagta 5640 cctcttggac ttctgaattg atccagttgt catccaccac agacatctca catcagatac 5700 agacagttcc aagattgaca acagagaaca acctgctgga aagacctggg cagaaatgga 5760 gagccctgcg ggaaccatgc tacattttca tctaaagaga gaatgcacat ctgatgagac 5820 tgaaagttct ttgttgtttt agattgtaga atggtattga attggtctgt ggaaaattgc 5880 attgctttta tttctttgtg taatcaagtt taagtaatag gggatatata atcataagca 5940 ttttagggtg ggagggacta ttaagtaatt ttaagtgggt ggggttattt agaatgttag 6000 aataatatta tgtattagat atcgctataa gtggacatgc gtacttactt gtaacccttt 6060 accctataat tgctatcctt aaagatttca aataaactcg gagggaactg cagggagacc 6120 aacttattta gagcgaattg gacatggata aaaaccccag tgggagaaag ttcaaaggtg 6180 attagattaa taatttaata gaggatgagt gacctctgat aaattactgc tagaatgaac 6240 ttgtcaatga tggatggtaa attttcatgg aagttataaa agtgataaat atgcccctac 6300 ccctccttct aactttattg ctgtattctc ttcactctat atttctctct atttgctaat 6360 attgcattgc tgttacaata aaaattcaat aaagatttag tggttaagtg caaa 6414 27 5420 DNA Homo sapiens 27 gtacacgcgc ttcaacttcg gttggtgtgt gtcgaagaaa cctgactgcg ccctgaggag 60 aacagcggag aaggtccacc gagcctggcg aaaggtccgc tgagcgggct gtcgtccgga 120 gccactccgg gctgcggagc acccagtgga gaccgcgcct ggctcaggtg tgggacccca 180 tccttcctgt 
cttcgcagag gagtcctcgc gtgaaataag cgggttttga aaacaaaaaa 240 aagaaggagt ggaagagggg gccaggatcc aggcctccat ccccacagaa gtgaagctac 300 agctgggagg tctcctccca ccccaaccgt caccctgggt cccgactgcc cacctcctcc 360 tcctccccct ccccccaaca acaacaacaa caacaactcc aagcacaccg gccataagag 420 tgcgtgtgtc cccaacatga ccgaacgaag aagggacgag ctctctgaag agatcaacaa 480 cttaagagag aaggtcatga agcagtcgga ggagaacaac aacctgcaga gccaggtgca 540 gaagctcaca gaggagaaca ccacccttcg agagcaagtg gaacccaccc ctgaggatga 600 ggatgatgac atcgagctcc gcggtgctgc agcagctgct gccccacccc ctccaataga 660 ggaagagtgc ccagaagacc tcccagagaa gttcgatggc aacccagaca tgctggctcc 720 tttcatggcc cagtgccaga tcttcatgga aaagagcacc agggatttct cagttgatcg 780 tgtccgtgtc tgcttcgtga caagcatgat gaccggccgt gctgcccgtt gggcctcagc 840 aaagctggag cgctcccact acctgatgca caactaccca gctttcatga tggaaatgaa 900 gcatgtcttt gaagaccctc agaggcgaga ggttgccaaa cgcaagatca gacgcctgcg 960 ccaaggcatg gggtctgtca tcgactactc caatgctttc cagatgattg cccaggacct 1020 ggattggaac gagcctgcgc tgattgacca gtaccacgag ggcctcagcg accacattca 1080 ggaggagctc tcccacctcg aggtcgccaa gtcgctgtct gctctgattg ggcagtgcat 1140 tcacattgag agaaggctgg ccagggctgc tgcagctcgc aagccacgct cgccaccccg 1200 ggcgctggtg ttgcctcaca ttgcaagcca ccaccaggta gatccaaccg agccggtggg 1260 aggtgcccgc atgcgcctga cgcaggaaga aaaagaaaga cgcagaaagc tgaacctgtg 1320 cctctactgt ggaacaggag gtcactacgc tgacaattgt cctgccaagg cctcaaagtc 1380 ttcgccggcg ggaaactccc cggccccgct gtagagggac cttcagcgac cgggccagaa 1440 ataataaggt ccccacaaga tgatgcctca tctccacact tgcaagtgat gctccagatt 1500 catcttccgg gcagacacac cctgttcgtc cgagccatga tcgattctgg tgcttctggc 1560 aacttcattg atcacgaata tgttgctcaa aatggaattc ctctaagaat caaggactgg 1620 ccaatacttg tggaagcaat tgatgggcgc cccatagcat cgggcccagt tgtccacgaa 1680 actcacgacc tgatagttga cctgggagat caccgagagg tgctgtcatt tgatgtgact 1740 cagtctccat tcttccctgt cgtcctaggg gttcgctggc tgagcacaca tgatcccaat 1800 atcacatgga gcactcgatc tatcgtcttt gattctgaat actgccgcta ccactgccgg 1860 atgtattctc caataccacc atcgctccca 
ccaccagcac cacaaccgcc actctattat 1920 ccagtagatg gatacagagt ttaccaacca gtgaggtatt actatgtcca gaatgtgtac 1980 actccagtag atgagcacgt ctacccagat caccgcctgg ttgaccctca catagaaatg 2040 atacctggag cacacagtat tcccagtgga catgtgtatt cactgtccga acctgaaatg 2100 gcagctcttc gagattttgt ggcaagaaat gtaaaagatg ggctaattac tccaacgatt 2160 gcacctaatg gagcccaagt tctccaggtg aagagggggt ggaaactgca agtttcttat 2220 gattgccgag ctccaaacaa ttttactatc cagaatcagt atcctcgcct atctattcca 2280 aatttagaag accaagcaca cctggcaacg tacactgaat tcgtacctca aatacctgga 2340 taccaaacat accccacata tgccgcgtac ccgacctacc cagtaggatt cgcctggtac 2400 ccagtgggac gagacggaca aggaagatca ctatatgtac ctgtgatgat cacttggaat 2460 ccacactggt accgccagcc tccggtacca cagtacccgc cgccacagcc gccgcctcca 2520 ccaccaccac cgccgccgcc tccatcttac agtaccctgt aaatacctgt catgtccttc 2580 aggatctctg ccctcaaaat ttattcctgt tcagcttctc aatcagtgac tgtgtgctaa 2640 attttaggct actgtatctt caggccacct gaggcacatc ctctctgaaa cggctatgga 2700 aggttagggc cactctggac tggcacacat cctaaagcac caaaagacct tcaacatttt 2760 ctgagagcaa cagagtattt gccaataaat gatctctcat ttttccacct tgactgccaa 2820 tctaactaaa ataattaata agtttacttt ccagccagtc ctggaagtct gggttttacc 2880 tgccaaaacc tccatcacca tctaaattat aggctgccaa atttgctgtt taacatttac 2940 agagaagctg atacaaacgc aggaaatgct gatttcttta tggaggggga gacgaggagg 3000 aggaggacat gacttttctt gcggtttcgg taccctcttt ttaaatcact ggaggactga 3060 ggccttatta aggaagccaa aattatcggt gcagtgtgga aaggcttccg tgatcctctc 3120 gctgcaccct tagaaacttc accgtcttca aactccattt ccatggttct gttaattctc 3180 aaggagcagc aactcgactg gttctcccag gagcaggaaa aacccttgtg acatgaaaca 3240 tctcaggcct gaaaagaaag tgctctctca gatggactct tgcatgttaa gactatgtct 3300 tcacatcatg gtgcaaatca catgtaccca atgactccgg ctttgacaca acaccttacc 3360 atcatcatgc catgatggct tccacaaagc attaaacctg gtaaccagag attactggtg 3420 gctccagcgt tgttagatgt tcatgaaatg tgaccacctc tcaatcacct ttgagggcta 3480 aagagtagca catcaaaagg actccaaaat cccataccca actcttaaga gatttgtcct 3540 ggtacttcag aaagaatttt catgagtgtt cttaattggc 
tggaaaagca ccagctgacg 3600 ttttggaaga atctatccat gtgtctgcct ccatatgcat ctgggcattt catcttcagt 3660 cccctcatta gactgtagca ttaggatgtg tggagagagg agaaatgatt tagcacccag 3720 attcacactc ctatgcctgg aagggggaca tctttgaaga agaggaatta gggctgtgga 3780 cactgtcttg aggatgtgga cttccttagt gagctccaca ttacttgatg gtaaccactt 3840 caaaaggatc agaatccacg taatgaaaaa ggtccctcta gaggatggag ctgatgtgaa 3900 gctgccaatg gatgaaaagc ctcagaaagc aactcaaagg actcaaagca acggacaaca 3960 caagagttgt cttcagccca gtgacacctc tgatgtcccc tggaagcttt gtgctaacct 4020 gggactgcct gacttccttt agcctggtcc cttgctacta ccttgaactg ttttatctaa 4080 cctctctttt tctgtttaat tctttgctac tgccattgac cctgctgcag gatttgtgtc 4140 attttcctgc ctggttgctg agactccatt ttgctgccac acacagagat gtaagaggca 4200 ggctttaatt gccaaagcac agtttgagca gtagaaaaca acatggtgta tatctcaaat 4260 tgcctgacat gaagaggagt ctaacggtga agtttcactt ttcatcagca tcatctttca 4320 catgttcatt atcatccgct cttattcttg catgtttaaa cacttaaaat ttttagtata 4380 atttttagtg tgttttgaag tggtgactag gctttcaaaa acttccattg aattacaaag 4440 cactatccag ttcttattgt taaactaagt aaaaatgata agtaacatag tgtaaaatat 4500 tcctttactg tgaacttctt acaatgctgt gaatgagagg ctcctcagaa ctggagcatt 4560 tgtataataa ttcatcctgt tcatcttcaa ttttaacatc atatataatt tcaattctat 4620 caattgggcc tttaaaaatc atataaaagg atataaaatt tgaaaagaga aacctaattg 4680 gctatttaat ccaaaacaac tttttttttt ccttcaatgg aatcagaaag cttgtcaatc 4740 actcatgtgt tttagagtaa ttactttctc aagtatcaat tcatcctcca tacatttgaa 4800 ttcaagttgt tttttgtcaa atttacagtt gtcaattgat cttcaagctg cagggtgcct 4860 agaaatgggc cgttgtctgt agccctggca tgtgcacacg gacatttgcc accactgcaa 4920 gcaaaagtct ggagaagttc accaacgaca agaacgatta gggaaaatat gctgctgtgg 4980 gttaacaact cagaaagtcc ctgatccaca tttggctgtt tactaaagct tgtgattaac 5040 tttttggcag tgtgtactat gctctattgc tatatatgct atctataaat gtagatgtta 5100 aggataagta attctaaatt tattattcta tagttttgaa gtttggttaa gtttcctttc 5160 actcaattga tttattttgt tgttaatcaa atttatgtta attggatcct ttaaattttt 5220 tttggcattt tccaacaaaa atggctttat tcataagaaa ggaaaaaaat 
caatggaatt 5280 tgatatctaa agaagttaga aagggagcaa aataaaaaac ataaaggaga tagatgaatt 5340 agtaagcaaa tcagtagtcg agtttttcaa actggcaaaa ttaattaatt gacttttagc 5400 ccaaatttac attgttaatt 5420 28 325 PRT Homo sapiens 28 Met Thr Glu Arg Arg Arg Asp Glu Leu Ser Glu Glu Ile Asn Asn Leu 1 5 10 15 Arg Glu Lys Val Met Lys Gln Ser Glu Glu Asn Asn Asn Leu Gln Ser 20 25 30 Gln Val Gln Lys Leu Thr Glu Glu Asn Thr Thr Leu Arg Glu Gln Val 35 40 45 Glu Pro Thr Pro Glu Asp Glu Asp Asp Asp Ile Glu Leu Arg Gly Ala 50 55 60 Ala Ala Ala Ala Ala Pro Pro Pro Pro Ile Glu Glu Glu Cys Pro Glu 65 70 75 80 Asp Leu Pro Glu Lys Phe Asp Gly Asn Pro Asp Met Leu Ala Pro Phe 85 90 95 Met Ala Gln Cys Gln Ile Phe Met Glu Lys Ser Thr Arg Asp Phe Ser 100 105 110 Val Asp Arg Val Arg Val Cys Phe Val Thr Ser Met Met Thr Gly Arg 115 120 125 Ala Ala Arg Trp Ala Ser Ala Lys Leu Glu Arg Ser His Tyr Leu Met 130 135 140 His Asn Tyr Pro Ala Phe Met Met Glu Met Lys His Val Phe Glu Asp 145 150 155 160 Pro Gln Arg Arg Glu Val Ala Lys Arg Lys Ile Arg Arg Leu Arg Gln 165 170 175 Gly Met Gly Ser Val Ile Asp Tyr Ser Asn Ala Phe Gln Met Ile Ala 180 185 190 Gln Asp Leu Asp Trp Asn Glu Pro Ala Leu Ile Asp Gln Tyr His Glu 195 200 205 Gly Leu Ser Asp His Ile Gln Glu Glu Leu Ser His Leu Glu Val Ala 210 215 220 Lys Ser Leu Ser Ala Leu Ile Gly Gln Cys Ile His Ile Glu Arg Arg 225 230 235 240 Leu Ala Arg Ala Ala Ala Ala Arg Lys Pro Arg Ser Pro Pro Arg Ala 245 250 255 Leu Val Leu Pro His Ile Ala Ser His His Gln Val Asp Pro Thr Glu 260 265 270 Pro Val Gly Gly Ala Arg Met Arg Leu Thr Gln Glu Glu Lys Glu Arg 275 280 285 Arg Arg Lys Leu Asn Leu Cys Leu Tyr Cys Gly Thr Gly Gly His Tyr 290 295 300 Ala Asp Asn Cys Pro Ala Lys Ala Ser Lys Ser Ser Pro Ala Gly Asn 305 310 315 320 Ser Pro Ala Pro Leu 325 29 708 PRT Homo sapiens 29 Met Thr Glu Arg Arg Arg Asp Glu Leu Ser Glu Glu Ile Asn Asn Leu 1 5 10 15 Arg Glu Lys Val Met Lys Gln Ser Glu Glu Asn Asn Asn Leu Gln Ser 20 25 30 Gln Val Gln Lys Leu Thr Glu Glu Asn Thr Thr Leu Arg Glu Gln Val 35 40 
45 Glu Pro Thr Pro Glu Asp Glu Asp Asp Asp Ile Glu Leu Arg Gly Ala 50 55 60 Ala Ala Ala Ala Ala Pro Pro Pro Pro Ile Glu Glu Glu Cys Pro Glu 65 70 75 80 Asp Leu Pro Glu Lys Phe Asp Gly Asn Pro Asp Met Leu Ala Pro Phe 85 90 95 Met Ala Gln Cys Gln Ile Phe Met Glu Lys Ser Thr Arg Asp Phe Ser 100 105 110 Val Asp Arg Val Arg Val Cys Phe Val Thr Ser Met Met Thr Gly Arg 115 120 125 Ala Ala Arg Trp Ala Ser Ala Lys Leu Glu Arg Ser His Tyr Leu Met 130 135 140 His Asn Tyr Pro Ala Phe Met Met Glu Met Lys His Val Phe Glu Asp 145 150 155 160 Pro Gln Arg Arg Glu Val Ala Lys Arg Lys Ile Arg Arg Leu Arg Gln 165 170 175 Gly Met Gly Ser Val Ile Asp Tyr Ser Asn Ala Phe Gln Met Ile Ala 180 185 190 Gln Asp Leu Asp Trp Asn Glu Pro Ala Leu Ile Asp Gln Tyr His Glu 195 200 205 Gly Leu Ser Asp His Ile Gln Glu Glu Leu Ser His Leu Glu Val Ala 210 215 220 Lys Ser Leu Ser Ala Leu Ile Gly Gln Cys Ile His Ile Glu Arg Arg 225 230 235 240 Leu Ala Arg Ala Ala Ala Ala Arg Lys Pro Arg Ser Pro Pro Arg Ala 245 250 255 Leu Val Leu Pro His Ile Ala Ser His His Gln Val Asp Pro Thr Glu 260 265 270 Pro Val Gly Gly Ala Arg Met Arg Leu Thr Gln Glu Glu Lys Glu Arg 275 280 285 Arg Arg Lys Leu Asn Leu Cys Leu Tyr Cys Gly Thr Gly Gly His Tyr 290 295 300 Ala Asp Asn Cys Pro Ala Lys Ala Ser Lys Ser Ser Pro Ala Gly Asn 305 310 315 320 Ser Pro Ala Pro Leu Tyr Glu Gly Pro Ser Ala Thr Gly Pro Glu Ile 325 330 335 Ile Arg Ser Pro Gln Asp Asp Ala Ser Ser Pro His Leu Gln Val Met 340 345 350 Leu Gln Ile His Leu Pro Gly Arg His Thr Leu Phe Val Arg Ala Met 355 360 365 Ile Asp Ser Gly Ala Ser Gly Asn Phe Ile Asp His Glu Tyr Val Ala 370 375 380 Gln Asn Gly Ile Pro Leu Arg Ile Lys Asp Trp Pro Ile Leu Val Glu 385 390 395 400 Ala Ile Asp Gly Arg Pro Ile Ala Ser Gly Pro Val Val His Glu Thr 405 410 415 His Asp Leu Ile Val Asp Leu Gly Asp His Arg Glu Val Leu Ser Phe 420 425 430 Asp Val Thr Gln Ser Pro Phe Phe Pro Val Val Leu Gly Val Arg Trp 435 440 445 Leu Ser Thr His Asp Pro Asn Ile Thr Trp Ser Thr Arg Ser Ile Val 450 455 460 Phe Asp 
Ser Glu Tyr Cys Arg Tyr His Cys Arg Met Tyr Ser Pro Ile 465 470 475 480 Pro Pro Ser Leu Pro Pro Pro Ala Pro Gln Pro Pro Leu Tyr Tyr Pro 485 490 495 Val Asp Gly Tyr Arg Val Tyr Gln Pro Val Arg Tyr Tyr Tyr Val Gln 500 505 510 Asn Val Tyr Thr Pro Val Asp Glu His Val Tyr Pro Asp His Arg Leu 515 520 525 Val Asp Pro His Ile Glu Met Ile Pro Gly Ala His Ser Ile Pro Ser 530 535 540 Gly His Val Tyr Ser Leu Ser Glu Pro Glu Met Ala Ala Leu Arg Asp 545 550 555 560 Phe Val Ala Arg Asn Val Lys Asp Gly Leu Ile Thr Pro Thr Ile Ala 565 570 575 Pro Asn Gly Ala Gln Val Leu Gln Val Lys Arg Gly Trp Lys Leu Gln 580 585 590 Val Ser Tyr Asp Cys Arg Ala Pro Asn Asn Phe Thr Ile Gln Asn Gln 595 600 605 Tyr Pro Arg Leu Ser Ile Pro Asn Leu Glu Asp Gln Ala His Leu Ala 610 615 620 Thr Tyr Thr Glu Phe Val Pro Gln Ile Pro Gly Tyr Gln Thr Tyr Pro 625 630 635 640 Thr Tyr Ala Ala Tyr Pro Thr Tyr Pro Val Gly Phe Ala Trp Tyr Pro 645 650 655 Val Gly Arg Asp Gly Gln Gly Arg Ser Leu Tyr Val Pro Val Met Ile 660 665 670 Thr Trp Asn Pro His Trp Tyr Arg Gln Pro Pro Val Pro Gln Tyr Pro 675 680 685 Pro Pro Gln Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Ser 690 695 700 Tyr Ser Thr Leu 705 
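
The relationship between the nucleotide and polypeptide listings above can be illustrated with a short translation sketch. It is offered only as a minimal check, not as part of the disclosed method. It assumes the standard genetic code, and the helper name `translate` and the variable `orf_start` are hypothetical. The 54-nucleotide fragment is copied from the block numbered 421-480 of SEQ ID NO: 24, beginning at the first atg in that block. Its translation corresponds to the first eighteen residues listed for SEQ ID NO: 28 (Met Thr Glu Arg Arg Arg Asp Glu Leu Ser Glu Glu Ile Asn Asn Leu Arg Glu).

```python
from itertools import product

# Standard genetic code, written in the conventional t/c/a/g ordering
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Spaces (as used in the sequence listing) are ignored; case is ignored.
    """
    dna = dna.lower().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# Fragment copied from SEQ ID NO: 24, starting at the first atg
# in the block numbered 421-480 of the listing.
orf_start = "atgaccgaacgaagaagggacgagctctctgaagagatcaacaacttaagagag"

print(translate(orf_start))  # MTERRRDELSEEINNLRE
```

The one-letter output corresponds to the three-letter residues given at the start of SEQ ID NO: 28 and SEQ ID NO: 29.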

What is claimed is:
 1. A process for identifying an agent that modulates the activity of a cancer-related gene comprising: (a) contacting a compound with a cell containing a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27 and under conditions promoting the expression of said gene; and (b) detecting a difference in expression of said gene relative to when said compound is not present thereby identifying an agent that modulates the activity of a cancer-related gene.
 2. The process of claim 1 wherein said gene has a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27.
 3. The process of claim 1 or 2 wherein the cell is a cancer cell and the difference in expression is a decrease in expression.
 4. The process of claim 3 wherein said cancer cell is a breast or liver cancer cell.
 5. A process for identifying an anti-neoplastic agent comprising contacting a cell exhibiting neoplastic activity with a compound first identified as a cancer related gene modulator using a process of one of claims 1-4 and detecting a decrease in said neoplastic activity after said contacting compared to when said contacting does not occur.
 6. The process of claim 5 wherein said neoplastic activity is accelerated cellular replication.
 7. The process of claim 5 wherein said decrease in neoplastic activity results from the death of the cell.
 8. A process for identifying an anti-neoplastic agent comprising administering to an animal exhibiting a cancerous condition an effective amount of an agent first identified according to a process of claim 1 and detecting a decrease in said cancerous condition.
 9. A process for determining the cancerous status of a cell, comprising determining an increase in the level of expression in said cell of a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27 wherein an elevated expression relative to a known non-cancerous cell indicates a cancerous state or potentially cancerous state.
 10. The process of claim 9 wherein said elevated expression is due to an increased copy number.
 11. An isolated polypeptide comprising an amino acid sequence homologous to an amino acid sequence selected from the group consisting of the polypeptide sequences of SEQ ID NO: 28-29 wherein any difference between said amino acid sequence and the sequence of polypeptide sequences of SEQ ID NO: 28-29 is due solely to conservative amino acid substitutions and wherein said isolated polypeptide comprises at least one immunogenic fragment.
 12. An isolated polypeptide comprising an amino acid sequence selected from the group consisting of the polypeptide sequences of SEQ ID NO: 28-29.
 13. An antibody that reacts with a polypeptide comprising an amino acid sequence selected from the group consisting of polypeptide sequences of SEQ ID NO: 28-29.
 14. The antibody of claim 13 wherein said antibody is a recombinant antibody.
 15. The antibody of claim 13 wherein said antibody is a synthetic antibody.
 16. The antibody of claim 13 wherein said antibody is a humanized antibody.
 17. An immunoconjugate comprising the antibody of claim 13 and a cytotoxic agent.
 18. The immunoconjugate of claim 17 wherein said cytotoxic agent is a member selected from the group consisting of a calicheamicin, a maytansinoid, an adozelesin, a cytotoxic protein, a taxol, a taxotere, a taxoid and DC1.
 19. The immunoconjugate of claim 18 wherein said calicheamicin is calicheamicin γ₁ᴵ, N-acetyl gamma calicheamicin dimethyl hydrazide or calicheamicin θ₁ᴵ.
 20. The immunoconjugate of claim 18 wherein said maytansinoid is DM1.
 21. The immunoconjugate of claim 18 wherein said cytotoxic protein is ricin, abrin, gelonin, pseudomonas exotoxin or diphtheria toxin.
 22. The immunoconjugate of claim 18 wherein said taxol is paclitaxel.
 23. The immunoconjugate of claim 18 wherein said taxotere is docetaxel.
 24. A process for treating cancer comprising contacting a cancerous cell in vivo with an agent having activity against an expression product encoded by a gene sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27.
 25. The process of claim 24 wherein said agent is an antibody of any one of claims 13-16.
 26. The process of claim 24 wherein said agent is an immunoconjugate of claim 17.
 27. An immunogenic composition comprising a polypeptide of claim 11.
 28. An immunogenic composition comprising a polypeptide of claim 12.
 29. The process of claim 24 wherein said cancer is breast or liver cancer.
 30. A process for treating cancer in an animal afflicted therewith comprising administering to said animal an amount of an immunogenic composition of claim 27 sufficient to elicit the production of cytotoxic T lymphocytes specific for the polypeptide of claim 11.
 31. A process for treating cancer in an animal afflicted therewith comprising administering to said animal an amount of an immunogenic composition of claim 28 sufficient to elicit the production of cytotoxic T lymphocytes specific for the polypeptide of claim 12.
 32. A process for treating a cancerous condition in an animal afflicted therewith comprising administering to said animal a therapeutically effective amount of an agent first identified as having anti-neoplastic activity using the process of claim 8.
 33. A process for protecting an animal against cancer comprising administering to an animal at risk of developing cancer a therapeutically effective amount of an agent first identified as having anti-neoplastic activity using the process of claim 8.
 34. The process of claim 30, 31, 32 or 33 wherein said animal is a human being.
 35. The process of claim 30, 31, 32 or 33 wherein said cancer is breast or liver cancer.
 36. A method for producing test data with respect to the gene modulating activity of a compound comprising: (a) contacting a compound with a cell containing a gene that corresponds to a polynucleotide having a sequence selected from the group consisting of polynucleotide sequences of SEQ ID NO: 1-27 and under conditions promoting the expression of said gene; (b) detecting a difference in expression of said gene relative to when said compound is not present; and (c) producing test data with respect to the gene modulating activity of said compound based on a decrease in the expression of the determined genes that correspond to SEQ ID NO: 1-27 indicating gene modulating action.
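
The comparison recited in steps (b) and (c) of claims 1 and 36 can be sketched as a simple calculation. The following is a minimal illustration under stated assumptions, not the claimed procedure itself. It assumes that expression has already been measured as replicate numeric values for cells with and without the compound, keyed by SEQ ID NO. The names `ModulationResult` and `expression_test_data`, and the 0.5 fold-change cutoff, are hypothetical choices made for the example; the claims do not specify a measurement method or threshold.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ModulationResult:
    seq_id: int         # SEQ ID NO of the gene that was measured
    fold_change: float  # treated mean divided by untreated mean
    decreased: bool     # True if expression fell to or below the cutoff


def expression_test_data(untreated, treated, cutoff=0.5):
    """Compare expression with and without a compound.

    untreated, treated: dicts mapping SEQ ID NO -> list of replicate
    expression measurements (arbitrary but consistent units).
    cutoff: hypothetical fold-change threshold; a value at or below it
    is reported as a decrease in expression.
    """
    results = []
    for seq_id in sorted(untreated):
        baseline = mean(untreated[seq_id])
        if baseline == 0:
            raise ValueError(f"no baseline expression for SEQ ID NO: {seq_id}")
        fold = mean(treated[seq_id]) / baseline
        results.append(ModulationResult(seq_id, fold, fold <= cutoff))
    return results


# Illustrative numbers only; these are not measurements from the disclosure.
untreated = {24: [100.0, 110.0, 90.0], 25: [80.0, 80.0]}
treated = {24: [40.0, 50.0, 30.0], 25: [76.0, 84.0]}

for row in expression_test_data(untreated, treated):
    print(row)
```

In this sketch a compound would be flagged as a candidate modulator of a gene only when the treated-to-untreated ratio falls at or below the chosen cutoff; any real screen would substitute its own measurement method and statistical criteria.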